Someone who is interested in learning and doing good.
My Twitter: https://twitter.com/MatthewJBar
My Substack: https://matthewbarnett.substack.com/
I think you missed some basic details about what I wrote. I encourage people to compare what Eliezer is saying here to what I actually wrote. You said:
If you think you’ve demonstrated by clever textual close reading that Eliezer-2018 or Eliezer-2008 thought that it would be hard to get a superintelligence to understand humans, you have arrived at a contradiction and need to back up and start over.
I never said that you or any other MIRI person thought it would be “hard to get a superintelligence to understand humans”. Here’s what I actually wrote:
Non-MIRI people sometimes strawman MIRI people as having said that AGI would literally lack an understanding of human values. I don’t endorse this, and I’m not saying this.
[...]
I agree that MIRI people never thought the problem was about getting AI to merely understand human values, and that they have generally maintained there was extra difficulty in getting an AI to care about human values. However, I distinctly recall MIRI people making a big deal about the value identification problem (AKA the value specification problem), for example in this 2016 talk from Yudkowsky.[3] The value identification problem is the problem of “pinpointing valuable outcomes to an advanced agent and distinguishing them from non-valuable outcomes”. In other words, it’s the problem of specifying a function that reflects the “human value function” with high fidelity.
I mostly don’t think that the points you made in your comment respond to what I said. My best guess is that you’re responding to a stock character who represents the people who have given similar arguments to you repeatedly in the past. In light of your personal situation, I’m actually quite sympathetic to you responding this way. I’ve seen my fair share of people misinterpreting you on social media too. It can be frustrating to hear the same bad arguments, often made by people with poor intentions, over and over again, and still engage thoughtfully each time. I just don’t think I’m making the same mistakes as those people. I tried to distinguish myself from them in the post.
I would find it slightly exhausting to reply to all of this comment, given that I think you misrepresented me in a big way right out of the gate, so I’m currently not sure if I want to put in the time to compile a detailed response.
That said, I think some of the things you said in this comment were nice, and helped to clarify your views on this subject. I admit that I may have misinterpreted some of the comments you made, and if you provide specific examples, I’m happy to retract or correct them. I’m thankful that you spent the time to engage. :)
I’m curious if you have any thoughts on the effect regulations will have on AI timelines. To have a transformative effect, AI would likely need to automate many forms of management, which involves making a large variety of decisions without the approval of other humans. The obvious effect of deploying these technologies will therefore be to radically upend our society and way of life, taking control away from humans and putting it in the hands of almost alien decision-makers. Will bureaucrats, politicians, voters, and ethics committees simply stand idly by while the tech industry takes over our civilization like this?
On the one hand, it is true that cars, airplanes, electricity, and computers were all introduced with relatively few regulations. These technologies went on to change our lives greatly in the last century and a half. On the other hand, nuclear power, human cloning, genetic engineering of humans, and military weapons each have a comparable potential to change our lives, and yet are subject to tight regulations, both formally, as the result of government-enforced laws, and informally, as engineers regularly refuse to work on these technologies indiscriminately, fearing backlash from the public.
One objection is that it is too difficult to slow down AI progress. I don’t buy this argument.
A central assumption of the Bio Anchors model, and of all hardware-based models of AI progress more generally, is that access to large amounts of computation is a key constraint on AI development. Semiconductor fabrication plants require multi-billion dollar upfront investments and are easily controllable by national governments; they could hardly evade the oversight of a dedicated international task force.
We saw in 2020 that, if threats are big enough, governments have no problem taking unprecedented action, quickly enacting sweeping regulations of our social and business life. If anything, a global limit on manufacturing a particular technology enjoys even more precedent than, for example, locking down over half of the world’s population under some sort of stay-at-home order.
Another argument states that the incentives to make fast AI progress are simply too strong: first mover advantages dictate that anyone who creates AGI will take over the world. Therefore, we should expect investments to accelerate dramatically, not slow down, as we approach AGI. This argument has some merit, and I find it relatively plausible. At the same time, it relies on a very pessimistic view of international coordination that I find questionable. A similar first-mover advantage was also observed for nuclear weapons, prompting Bertrand Russell to go as far as saying that only a world government could possibly deter nations from developing and using nuclear weapons. Yet, I do not think this prediction was borne out.
Finally, it is possible that the timeline you state here is conditioned on no coordinated slowdowns. I sometimes see people making this assumption explicit, and in your report you state that you did not attempt to model “the possibility of exogenous events halting the normal progress of AI research”. At the same time, if regulation ends up mattering a lot—say, it delays progress by 20 years—then all the conditional timelines will look pretty bad in hindsight, as they will have ended up omitting one of the biggest, most determinative factors of all. (Of course, it’s not misleading if you just state upfront that it’s a conditional prediction).
If you are completely unfamiliar with the actual science on obesity you probably think that’s dumb because obesity is caused by high-palatability foods. Read the first page linked if you’d prefer to know why that’s obviously wrong.
I admit to being, at present, persuaded by the high-palatability hypothesis, which I roughly translate into the following thesis: “The general rise in obesity is primarily explained by the rise of highly processed, addictive foods, which raise our natural set point, tricking our bodies into eating more calories than we ‘need’ before feeling full.”
I read the posts you linked (you referred to this one, right?), and I’m not convinced by them, but I’m open to people explaining why they think I’m still wrong.
First I’ll summarize the series briefly, and then respond to each point.
My brief summary
The series begins by outlining 8 mysteries:
1. Obesity has gotten a lot worse over time
2. Obesity abruptly got worse some time in the 1970s
3. There’s good evidence that we’re not winning the war against obesity
4. Hunter-gatherers don’t become obese
5. Lab animals and wild animals have also become obese over time
6. People and animals gain a lot of weight when exposed to palatable foods
7. People at higher altitudes seem to get obese at a lower frequency
8. Diets are not effective at reducing obesity, for nearly everyone
The series continues by arguing that CICO (as in calories in, calories out) cannot explain the current crisis, citing an array of evidence against that model. It then concludes that, given the inadequacy of CICO as a model of weight gain, the current obesity crisis must be caused by environmental contaminants, which neatly fit each of the 8 mysteries.
My interpretation of the mysteries
In my opinion, assuming the high-palatability hypothesis, very few of the mysteries are actually “mysteries” in the sense of being surprising.
For example, we can explain mystery 1 by saying that high-palatability foods have become more common over time (duh). We can explain 3 because very few people are effectively targeted by anti-obesity campaigns, and it’s intractable to simply ban high-palatability food (which is probably the only solution that would actually work on a large scale, short of advanced technology). We can explain 4 by pointing out that hunter-gatherers don’t eat high-palatability food. We can explain 6 trivially, since it essentially restates the hypothesis. We can explain 8 by pointing out that people don’t have unlimited willpower, and thus, don’t rigidly adhere to a dieting plan when given abundant choices to “cheat” and eat high-palatability food (which is highly addictive).
That leaves mysteries 2, 5 and 7, which I do think call out for more explanation. However,
Mystery 2 is practically equally mysterious under both the environmental contaminant hypothesis and the high-palatability hypothesis, since by the author’s admission, they have little idea about what chemicals were abruptly introduced into the environment starting in the 1970s. At the same time, I found their argument that foods were already highly palatable before the 1970s to be weak.
Sure, you can name a few palatable foods from before the 1970s (Oreos, Doritos, Twinkies, Coca-Cola), but I don’t find it particularly unlikely that the absolute number and variety of high-palatability foods have increased greatly since the 1970s, given the immense pressure on food corporations to hyper-optimize their food for consumption.
Mystery 5 is only a real mystery if indeed animals under controlled conditions are getting fatter over time. The author presents two sources for this claim.
Source one states in its abstract, “We examined samples collectively consisting of over 20 000 animals from 24 populations (12 divided separately into males and females) of animals representing eight species living with or around humans in industrialized societies.” The palatability hypothesis can elegantly explain what’s going on here. Animals who live near cities are exposed to human trash, and humans throw a lot of high-palatability food away. Animals eat the trash and get addicted to it, raising their set point, causing them to overconsume calories. Animals that live with humans get fed human-produced food.
Source two is about horses, and I lack a coherent explanation for the details. But, this is mostly because I don’t know how common it is for horses to eat hyper-palatable food, as I have very little experience with common horse-feeding practices. Overall I wasn’t able to find compelling evidence that animals in controlled conditions that don’t eat high-palatability foods are experiencing increasing rates of obesity. (Though, of course, I might have missed this evidence in the sources). [Edit: it looks like I was mistaken and the first source includes laboratory rats and mice in the study.]
As far as I can tell, the most surprising mystery is 7. The author presents impressive evidence regarding altitude anorexia, and studies that looked into alternative factors (including carbon dioxide and oxygen).
EDIT: I now think that oxygen is the leading culprit for altitude anorexia, even though the author says it isn’t. Their evidence against the oxygen hypothesis is the following: one study found a small effect, and another study was methodologically flawed. Putting aside the second study, the effect found in the first study was not small at all in my opinion; in fact, it found that people who exercised in a low oxygen environment lost about 60% more weight than those who didn’t! Scott Alexander has written about this and finds the oxygen hypothesis plausible.
Yet, all things considered, I still don’t think that enough alternative hypotheses have been explored to say that mystery 7 is anywhere near conclusive. It’s well-known that obesity rates vary by demographic group, that there are genetic confounders involved, and that demographic groups are not evenly distributed between high and low altitudes.
In poorer nations, such as China, it seems highly plausible to me that altitude correlates strongly with access to supermarkets and fast-food restaurants that carry lots of high-palatability foods. Urban centers are generally clustered in low-altitude areas, along coasts and alongside rivers. If people in urban areas are exposed to more high-palatability foods, as opposed to more traditional dishes, then it seems obvious that you’ll find a correlation between altitude and obesity. The contamination hypothesis is not needed to explain this fact.
My take on CICO
Given that I don’t find any of the mysteries very surprising (with the possible exception of 7), I don’t see why the contamination hypothesis falls out as a parsimonious explanation of the data. Admittedly, however, my main disagreement probably boils down to the section on the plausibility of CICO.
Being honest, I found many parts of the CICO post riddled with misleading statements, sometimes simply confusing CICO with the idea that diets and attempts-to-increase-willpower work (which I emphatically do not believe), or strawmanning CICO into a generic position that absolutely nothing other than calories and exercise matters, or that an excess 3500 calories precisely and linearly adds 1 pound of fat to your body.
Obviously other factors, including genetics, matter. Obviously diets do not work on a large scale. And obviously the formula is not as simple as “eating an extra 3500 calories always means you gain an extra pound, even extrapolated to people eating 10,000 calories a day.” None of these facts are strongly inconsistent with the high-palatability hypothesis as the dominant explanation of the data. In my opinion, these are quibbles, not knock-down arguments.
And, in any case, the author admits,
Sure, consumption in the US went from 2,025 calories per day in 1970 to 2,481 calories per day in 2010, a difference of 456 calories.
That’s a lot! As someone who has very carefully controlled my eating before, I saw first-hand how eating at a 500 calorie deficit made me lose weight, and conversely, how eating at a 500 calorie surplus made me gain weight. The author seems quick to handwave this fact away, as if a few hundred calories can’t add up over time. Their interlude responding to objections on this point also seems handwavey to me, and doesn’t give any evidence inconsistent with the high-palatability hypothesis.
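To make the “adds up over time” point concrete, here is a minimal back-of-envelope sketch of my own (an illustration I’m adding, not anything taken from the post or its sources). It uses the rough 3500-calories-per-pound heuristic discussed above, which, as I said, does not hold precisely, and it ignores that energy expenditure rises as you gain weight, so the naive linear figure greatly overstates what anyone would actually gain. The point is only that the order of magnitude is not easy to handwave away:

```python
# Naive back-of-envelope estimate (my own illustration, not the author's model):
# convert a sustained daily calorie surplus into weight gain using the crude
# heuristic that ~3500 excess kcal corresponds to ~1 lb of body fat.
# This deliberately ignores metabolic adaptation (expenditure rises with body
# mass), so real-world gains plateau well below this linear figure.

KCAL_PER_LB = 3500  # rough heuristic; not exact, as noted above

def naive_weight_gain_lbs(daily_surplus_kcal: float, days: int) -> float:
    """Linear, no-adaptation estimate of weight gained from a constant surplus."""
    return daily_surplus_kcal * days / KCAL_PER_LB

surplus = 2481 - 2025  # ~456 kcal/day rise in average US intake, 1970 to 2010
print(naive_weight_gain_lbs(surplus, days=365))        # ~47.6 lb/year, naive upper bound
print(0.1 * naive_weight_gain_lbs(surplus, days=365))  # even at 10% of that, ~4.8 lb/year
```

Even if only a small fraction of the naive figure shows up as actual weight gain, a sustained surplus of this size seems more than enough to drive large population-level changes over a few decades.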
Conclusion
Given its biologically plausible mechanism, its consistency with practically all the “mysteries”, and its fit with common sense and general scientific wisdom (from what I gather), it seems highly likely to me that the high-palatability hypothesis is correct. This, in my opinion, diminishes the case that money should be spent investigating alternative hypotheses (though the value of being proven wrong might be so high that it’s worth it anyway).
It’s as good a time as any to reiterate my reasons for disagreeing with what I see as the Yudkowskian view of future AI. What follows isn’t intended as a rebuttal of any specific argument in this essay, but merely a pointer for readers that may help explain why some people might disagree with the conclusion and reasoning contained within.
I’ll provide my cruxes point by point:
I think raw intelligence, while important, is not the primary factor that explains why humanity-as-a-species is much more powerful than chimpanzees-as-a-species. Notably, humans were once much less powerful, in our hunter-gatherer days, but over time, through the gradual process of accumulating technology, knowledge, and culture, humans now possess vast productive capacities that far outstrip our ancient powers.
Similarly, our ability to coordinate through language also plays a huge role in explaining our power compared to other animals. But whereas other animals, to a first approximation, can’t coordinate at all, the first AGIs we construct will be born into a culture already capable of coordinating and sharing knowledge, making the potential power difference between AGI and humans relatively much smaller than the difference between humans and other animals, at least at first.
Consequently, the first slightly smarter-than-human agent will probably not be able to leverage its raw intelligence to unilaterally take over the world, for pretty much the same reason that an individual human would not be able to unilaterally take over a band of chimps, in the state of nature, despite the intelligence advantage of the human.
There’s a large range of human intelligence, such that it makes sense to talk about AI slowly going from 50th percentile to 99.999th percentile on pretty much any important general intellectual task, rather than AI suddenly jumping to superhuman levels after a single major insight. In cases where progress in performance does happen rapidly, the usual reason is that there wasn’t much effort previously being put into getting better at the task.
The case of AlphaGo is instructive here: improving the SOTA on Go bots is not very profitable. We should expect, therefore, that there will be relatively few resources being put into that task, compared to the overall size of the economy. However, if a single rich company, like Google, at some point does decide to invest considerable resources into improving Go performance, then we could easily observe a discontinuity in progress. Yet, this discontinuity in output merely reflects a discontinuity in inputs, not a discontinuity as a response to small changes in those inputs, as is usually a prerequisite for foom in theoretical models.
Hardware progress and experimentation are much stronger drivers of AI progress than novel theoretical insights. The most impressive insights, like backpropagation and transformers, are probably in our past. And as the field becomes more mature, it will likely become even harder to make important theoretical discoveries.
These points make the primacy of recursive self-improvement, and as a consequence, unipolarity in AI takeoff, less likely in the future development of AI. That’s because hardware progress and AI experimentation are, for the most part, society-wide inputs, which can be contributed by a wide variety of actors, don’t exhibit strong feedback loops on an individual level, and more-or-less have smooth responses to small changes in their inputs. Absent some way of making AI far better via a small theoretical tweak, it seems that we should expect smooth, gradual progress by default, even if overall economic growth becomes very high after the invention of AGI.
[Update (June 2023): While I think these considerations are still important, I think the picture I painted in this section was misleading. I wrote about my views of AI services here.] There are strong pressures—including the principle of comparative advantage, diseconomies of scale, and gains from specialization—that incentivize making economic services narrow and modular, rather than general and all-encompassing. Illustratively, a large factory where each worker specializes in their particular role will be much more productive than a factory in which each worker is trained to be a generalist, even though no single worker understands the production process as a whole.
What is true in human economics will apply to AI services as well. This implies we should expect something like Eric Drexler’s AI perspective, which emphasizes economic production across many agents who trade and produce narrow services, as opposed to monolithic agents that command and control.
Having seen undeniable, large economic effects from AI, policymakers will eventually realize that AGI is important, and will launch massive efforts to regulate it. The current lack of concern almost certainly reflects the fact that powerful AI hasn’t arrived yet.
There’s a long history of people regulating industries after disasters—like nuclear energy—and, given the above theses, it seems likely that there will be at least a few “warning shots” which will provide a trigger for companies and governments to crack down and invest heavily into making things go the way they want.
(Note that this does not imply any sort of optimism about the effects of these regulations, only that they will exist and will have a large effect on the trajectory of AI.)
The effect of the above points is not to provide uniform optimism about AI safety and our collective future. It is true that, if we accept the previous theses, then many of the points in Eliezer’s list of AI lethalities become far less plausible. But, equally, one could view these theses pessimistically, by thinking that they imply the trajectory of future AI is much harder to intervene on or influence, relative to the Yudkowskian view.