Because the noise usually grows as the signal does. Consider Moore’s law for transistors per chip. Back when that number was about 10^4, the standard deviation was also small, say 10^3. Now that the count is about 10^8, no two chips are going to be within a thousand transistors of each other; the standard deviation is much bigger (~10^7).
This means that if you’re trying to fit the curve, being off by 10^5 is a small mistake when predicting the current transistor count, but a huge mistake when predicting past transistor counts. It’s not rare or implausible now to find a chip with 10^5 more transistors, but back in the ’70s that difference would have been a huge error, impossible under an accurate model of reality.
A basic fitting function, like least squares, doesn’t take this into account. It will trade off transistors now vs. transistors in the past as if the mistakes were of exactly equal importance. To do better you have to use something like a chi-squared method, where you explicitly weight the points by their variance. Or fit on a log scale using the simple method, which effectively assumes that the noise is proportional to the signal.
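Here’s a minimal sketch of the difference, with made-up data and doubling times (illustrative numbers, not actual Moore’s-law measurements): naive least squares on the raw counts lets the recent, high-variance points dominate the fit, while fitting on a log scale assumes multiplicative noise and weights every era comparably.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
years = np.arange(1971, 2021, 2).astype(float)
true_counts = 2.3e3 * 2 ** ((years - 1971) / 2)             # idealized: doubling every 2 years
counts = true_counts * rng.lognormal(0.0, 0.2, len(years))   # noise proportional to signal

def model(t, a, doubling_time):
    return a * 2 ** ((t - 1971) / doubling_time)

# 1) Naive least squares on raw counts: effectively dominated by the recent points.
(p_a, p_dt), _ = curve_fit(model, years, counts, p0=(1e3, 2.0), maxfev=10000)

# 2) Fit on a log scale: equivalent to assuming noise proportional to the signal.
slope, intercept = np.polyfit(years - 1971, np.log2(counts), 1)

print("raw-scale doubling time:", p_dt)
print("log-scale doubling time:", 1 / slope)
```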
As someone basically thinking alone (cue George Thorogood), I definitely would value more comments / discussion. But if someone has access to research retreats where they’re talking face to face as much as they want, I’m not surprised that they don’t post much.
Talking is a lot easier than writing, and more immediately rewarding. It can be an activity among friends. A face-to-face discussion is higher-bandwidth than one over the internet. You can assume a lot more about your audience, which saves a ton of effort. When talking, you are more allowed to bullshit and guess and handwave and collaboratively think with the other person, and still be interesting, whereas when writing your audience usually expects you to be confident in what you’ve written. Writing is hard, reading is hard, and understanding what people have written is harder than understanding what people have said; if you ask for clarification, that might get misunderstood in turn. This all applies to comments almost as much as to posts, particularly on technical subjects.
The two advantages writing has for me are that I can communicate in writing with people I couldn’t talk to, and that when you write something out you get a good long chance to make sure it’s not stupid. When talking it’s very easy to be convincing, including to yourself, even when you’re confused. That’s a lot harder in writing.
To encourage more discussion in writing, one could try to change the format to reduce these barriers as much as possible: fostering one-to-one or small-group threads rather than one-to-many, fostering/enabling knowledge about other posters, and creating a context that allows for more guesswork and collaborative thinking. Maybe one underutilized tool on current LW is the question thread. Question threads are great excuses to let people bullshit on a topic and then engage them in small-group threads.
Regarding your posts on AI safety, I have two opinions.
1: Maybe choose titles that allow the reader to figure out what they’re getting into. I can’t read everything, so I’d much rather read something whose title lets me infer it’s about e.g. AI timelines. In general, I would like the point to be slightly more obvious throughout.
2: Don’t stop posting, but slow down posting. Eliezer cheated in four ways: he did it for more time per day than you can afford, he was often rehashing arguments he’d already put into text elsewhere, he rarely posted original technical work, and if he hadn’t done a good job you wouldn’t know about him (while you have no such anthropic selection). Your AI posts often raise questions but only scrape the surface of an answer; I would rather read fewer but deeper posts.
I’ve definitely noticed, in the very slow process of improving my social skills, that people (in general, and me in particular) don’t give nearly enough compliments or praise relative to the optimum. Past me just didn’t notice when there was a good place for a compliment—the skill that I improved was fundamentally a noticing skill. I also benefited a lot from understanding the psychological idea of validation—people want validation, not just praise for any old thing.
Re: working on a specific thing. I have more or less accepted that the amount of praise one gets will not fit one’s needs. There’s a fame effect that causes a fat tail, and no particular reward for merely trying, which I think is necessary given the number of non-experts and how easy it is to produce bad work without noticing it. I definitely have to work on intrinsic motivation.
Why do people react to fire alarms? It’s not just that they’re public—smoke is public too. One big factor is that we’ve had reacting to fire alarms drilled into us since childhood, a policy probably formulated after a few incidents of children not responding to fire alarms.
What this suggests is that even if signals are unclear, maybe what we really need is training. If some semi-arbitrary advance is chosen as the signal, people may or may not change their behavior when that advance occurs, depending on whether they have been successfully trained to do so.
On the other hand, we should already be working on AI safety, and so attempting to set up a fire alarm may be pointless—we need people to already be evacuating and calling the firefighters.
Honestly? I feel like this same set of problems gets re-solved a lot. I’m worried that it’s a sign of ill health for the field.
I think we understand certain technical aspects of corrigibility (indifference and CIRL), but have hit a brick wall in certain other aspects (things that require sophisticated “common sense” about AIs or humans to implement, philosophical problems about how to get an AI to solve philosophical problems). I think this is part of what leads to re-treading old ground when new people (or a person wanting to apply a new tool) try to work on AI safety.
On the other hand, I’m not sure we’ve exhausted Concrete Problems yet. Yes, the answer is often “just have sophisticated common sense,” but I think the value is in exploring the problems and generating elegant solutions so that we can deepen our understanding of value functions and agent behavior (like TurnTrout’s work on low-impact agents). In fact, Tom is a co-author on a very good paper of toy problems, many of which require similar sorts of one-off solutions that still might advance our technical understanding of agents.
It’s not called econ 101 because it’s the only material you need.
Engaging with previous work on the subject is just like any other way of being less wrong—if you’re already convinced you’re right, it feels like a tedious box to be checked with no chance of influencing your conclusions. Yes, there is some signalling value to me, but the signalling has value precisely because I assign high probability that there is relevant, important prior work here. (EDIT: where “here” largely means the monetary policy bits, though I would still be positively signalled by some reference-dropping on the cultural stuff).
The problem, by which I mean the reason I would rather the scene had less of this mythic stuff, is that I subscribe to absolutely the meanest, smallest type of cynicism: things people love are dangerous.
Take political arguments. People love to have political arguments. If one considers the community in the abstract, then political arguments are great for the community—look at how much more discussion there is over on SSC these days!
I am, of course, assuming in this example that political arguments in internet comments are of little use. But I think there is a straightforward cause: political arguments can remain of little use precisely because people love them. If people didn’t love them, they would only have them when necessary.
People love myths. Or at least most people do, some of the time. That’s why the myths you hear about aren’t selected for usefulness.
Naturally, that one parameter has to be very precise in order to work—if you have 1000 bits of data, the parameter will take at least 1000 bits to write down.
Pretty cool scheme for fitting general scatterplots. You could do the same in higher dimensions, but intuitively it seems like you are actually anti-compressing the data, which makes their point that complexity shouldn’t be measured by parameter count.
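A minimal sketch of why parameter-counting misleads here (made-up data, and a much cruder encoding than the paper’s actual construction): all N bits of the “data” get packed into the binary expansion of a single parameter, so the one parameter costs about as many bits as the data it explains.

```python
# One "parameter" that fits N data points perfectly, because it simply stores them:
# theta needs about N bits of precision.
data = [1, 0, 1, 1, 0, 0, 1, 0]                              # hypothetical binary observations
theta = sum(b / 2 ** (i + 1) for i, b in enumerate(data))    # a single real number in [0, 1)

def predict(theta, i):
    """The 'model': read bit i back out of theta's binary expansion."""
    return int(theta * 2 ** (i + 1)) % 2

assert [predict(theta, i) for i in range(len(data))] == data
```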
Here, you dropped this from the last bullet point at the end :)
A very clear walkthrough of full non-indexical conditioning. Thanks! I think there’s still a big glaring warning sign that this could be wrong, which is the mismatch with frequency (and, by extension, betting). Probability is logically prior to frequency estimation, but that doesn’t mean I think they’re decoupled. If your “probability” has zero application because your decision theory uses “likeliness weights” calculated in an entirely different way, I think something has gone very wrong.
I think if you’ve gone wrong somewhere, it’s in trying to outlaw statements of the form “it is Monday today.”
Suppose on Monday the experimenters will give her a cookie after she answers the question, and on Tuesday the experimenters will give her ice cream. Do you really want to outlaw “in 5 minutes I will get a cookie” as a valid thing to have beliefs about?
In fact, I think you got it precisely backwards—probability distributions come from the assigner’s state of information, and therefore they must be built off of what the assigner actually knows. I don’t have access to some True Monday Detector, I only have access to my internal sense of time. “Now” is fundamental, “Monday” is the higher level construct. Similarly, I don’t have an absolute position sense—my probability distribution over things must always use relative coordinates (even if it’s “relative to the zero reading on this gauge here”) because there are no absolute coordinates available to me. I don’t have access to my mystical True Name, so I don’t know which of several duplicates is the Real Me unless I can describe it in relative terms like “the one who came first”—therefore “me” is fundamental, “the original Charlie” is the higher-level construct.
Anyhow, once you allow temporal information you go back to trying to figure out what your model should say when you demand a MEE constraint on Monday vs. Tuesday.
Using only speed to evaluate models lands you with a lookup table that stores the results you want. So you have to trade off speed and length: the speediest table of math results has length O(N) and speed O(N) (maybe? Not at all sure). The shortest general program has length O(log(N)) and uncomputably fast-growing time. So if we think of a separable cost function F(length) + G(time), then as long as F doesn’t grow super-fast and G doesn’t grow super-slow, eventually the lookup table will have a better score than brute-force search.
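A toy rendering of that score (the cost functions F and G are made up, and 2^N is only a stand-in for the real, uncomputably fast-growing runtime):

```python
import math

def F(length): return length                       # penalty on description length
def G(time):   return math.sqrt(time)              # slowly-growing penalty on runtime

for N in [4, 10, 100, 1000]:
    lookup_score = F(N) + G(N)                     # lookup table: length ~ N, time ~ N
    search_score = F(math.log2(N)) + G(2.0 ** N)   # short program: length ~ log N, huge time
    print(N, round(lookup_score, 1), round(search_score, 1))
# The lookup table wins once N is large, for any G that keeps growing.
```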
Ideally you want to find some happy medium; this is reminding me of my old post on approximate induction.
A lot of great points!
I think we can separate the arguments into about three camps, based on their purpose (though they don’t all cleanly sit in one camp):
Arguments why progress might be generally fast: Hominid variation, Brain scaling.
Arguments why a local advantage in AI might develop: Intelligence explosion, One algorithm, Starting high, Awesome AlphaZero.
Arguments why a local advantage in AI could cause a global discontinuity: Deployment scaling, Train vs. test, Payoff thresholds, Human-competition threshold, Uneven skills.
These facts need to work together to get the thesis of a single disruptive actor to go through: you need there to be jumps in AI intelligence, you need them to be fairly large even near human intelligence, and you need those increases to translate into a discontinuous impact on the world. This framework helps me evaluate arguments and counterarguments—for example, you don’t just argue against Hominid variation as showing that there will be a singularity, you argue against its more limited implications as well.
Bits I didn’t agree with, and therefore have lots to say about:
The counterargument seems pretty wishy-washy. You say: “Positive feedback loops are common in the world, and very rarely move fast enough and far enough to become a dominant dynamic in the world.” How common? How rare? How dominant? Is global warming a dominant positive feedback loop because warming leads to increased water in the atmosphere which leads to more warming, and it’s going to have a big effect on the world? Or is it none of those, because Earth won’t get all that much warmer, because there are other well-understood effects keeping it in homeostasis?
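(For calibration, here’s the generic textbook bookkeeping for a non-runaway positive feedback, not anything specific to the post: if each unit of direct warming induces a further fraction $f$ of warming, then

$$\Delta T = \Delta T_0 \left(1 + f + f^2 + \dots\right) = \frac{\Delta T_0}{1 - f} \quad \text{for } f < 1,$$

so a positive feedback loop can be perfectly real and still bounded; only $f \ge 1$ runs away.)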
More precisely, I think the argument from reference class that a positive feedback loop (or rather, the behavior that we approximate as a positive feedback loop) will be limited in time and space is hardly an argument at all—it practically concedes that the feedback loop argument works for the middle of the three camps above, but merely points out that it’s not also an argument that intelligence will be important. A strong argument against the intelligence feedback hypothesis has to argue that a positive feedback loop is unlikely.
One can obviously respond by emphasizing that objects in the reference class you’ve chosen (e.g. tipping back too far in your chair and falling) don’t generally impact the world, and therefore this is a reference class argument against AI impacting the world. But AI is not drawn uniformly from this reference class—the only reason we’re talking about it is because it’s been selected for the possibility of impacting the world. Failure to account for this selection pressure is why the strength of the argument seemed to change upon breaking it into parts vs. keeping it as a whole.
We agree that slow deployment speed can “smooth out” a discontinuous jump in the state of the art into a continuous change in what people actually experience. You present each section as a standalone argument, and so we also agree that fast deployment speed alone does not imply discontinuous jumps.
But I think keeping things so separate misses the point that fast deployment is among the necessary conditions for a discontinuous impact. There’s also a risk, if we think of things separately, of forgetting these necessary conditions when thinking about historical examples. Like, we might look at the history of drug development, where drug deployment and adoption takes a few years, and costs falling to allow more people to access the treatment takes more years, and notice that even though there’s an a priori argument for a discontinuous jump in best practices, people’s outcomes are continuous on the scale of several years. And then, if we’ve forgotten about other necessary factors, we might just attribute this to some mysterious low base rate of discontinuous jumps.
The counterargument doesn’t really hold together. We start ex hypothesi with some threshold effect in usefulness (e.g. good enough boats let you reach another island). Then you say that it won’t cause a discontinuity in things we care about directly; people might buy better boats, but because of this producers will spend more effort making better boats and sell them more expensively, so the “value per dollar” doesn’t jump. But this just assumes without justification that the producer eats up all the value. Why can’t the buyer and the producer both capture part of the increase in value? The only way the theoretical argument seems to work is in equilibrium, which isn’t what we care about.
Nuclear weapons are a neat example, but may be a misleading one. Nuclear weapons could have had half the yield, or twice the yield, without altering much about when they were built; if you’d disagree with this, I’d be interested in hearing about it. (Looking at your link, it seems like nuclear weapons were in fact more expensive per ton of TNT when they were first built, and yet they were built, which suggests there’s something fishy about their fit to this argument.)
I think we can turn this into a more general thesis: Research is often local, and often discontinuous, and that’s important in AI. Fields whose advance seems continuous on the several-year scale may look jumpy on the six-month scale, and those jumps are usually localized to one research team rather than distributed. You can draw a straight line through a plot of e.g. performance of image-recognition AI, but that doesn’t mean that at the times in between the points there was a program with that intermediate skill at image-recognition. This is important to AI if the scale of the jumps, and the time between them, allows one team to jump through some region (not necessarily a discontinuity) of large gain in effect and gain a global advantage.
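A toy illustration of that point (all dates and scores below are made up): the record-setting results can sit on a nearly straight line even though the best system actually available at any given moment is a step function.

```python
import numpy as np

jump_times = np.array([0.0, 0.6, 1.1, 1.9, 2.4, 3.2])   # hypothetical dates of new SOTA results
jump_scores = np.array([70, 74, 79, 83, 88, 92])          # hypothetical benchmark scores

slope, intercept = np.polyfit(jump_times, jump_scores, 1)
print("fitted line slope:", round(slope, 1))               # the record points look roughly linear

def best_available(t):
    """The system you could actually run at time t: a step function, not a line."""
    return jump_scores[np.searchsorted(jump_times, t, side="right") - 1]

print(best_available(1.5))   # still 79, below what the straight line suggests for t = 1.5
```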
The missing argument about strategy:
There’s one possible factor contributing to the likelihood of discontinuity that I didn’t see, and that’s the strategic one. If people think that there is some level of advantage in AI that will allow them to have an important global impact, then they might not release their intermediate work to the public (so that other groups don’t know their status, and so their work can’t be copied), creating an apparent discontinuity when they decide to go public, even if 90% of their AI research would have gotten them 90% of the taking-over-the-world power.