What seems entirely clear?
If they are not socially acceptable alarms, they aren’t fire alarms, according to the definition set out by Yudkowsky in the linked post. Zvi can use the word differently if he likes, but since he linked to the Yudkowsky piece I thought it worth mentioning that Yudkowsky means something different by it than Zvi does.
Your 90% is 2025? This suggests that you assign <10% credence to the disjunction of the following:
--The “Scaling hypothesis” is false. (One reason it could be false: Bigger does equal better, but transformative AI requires 10 orders of magnitude more compute, not 3. Another reason it could be false: We need better training environments, or better architectures, or something, in order to get something actually useful. Another reason it could be false: The scaling laws paper predicted the scaling of GPTs would start breaking down right about now. Maybe that prediction is correct.)
--There are major engineering hurdles that need to be crossed before people can train models 3+ orders of magnitude bigger than GPT-3.
--There is some other bottleneck (chip fab production speed?) that slows things down a few more years.
--There is some sort of war or conflict that destroys (or distracts resources from) multibillion-dollar AI projects.
Seems to me that this disjunction should get at least 50% probability. Heck, I think my credence in the scaling hypothesis is only about 50%.
Just chiming in to say I agree with Villiam. I actually read your whole post just now, and I thought it was interesting, and making an important claim… but I couldn’t follow it well enough to evaluate whether or not it is true, or a good argument, so I didn’t vote on it. I like the suggestion to break up your stuff into smaller chunks, and maybe add more explanation to them also.
For example, you could have made your first post something like “Remember that famous image of a black hole? Guess what: It may have just been signal hallucinated out of noise. Here’s why.” Then your next post could be: “Here’s a list of examples of this sort of thing happening again and again in physics, along with my general theory of what’s wrong with the epistemic culture of physics.” I think for my part at least, I didn’t have enough expertise to evaluate your claims about whether these examples really were false positives, nor enough expertise to evaluate whether they were just cherry-picked failures or indicative of a deeper problem in the community. If you had walked through the arguments in more detail, and maybe explained some key terms like “noise floor” etc., then I wouldn’t have needed expertise and would have been able to engage more productively.
I think the purpose of the OT and ICT is to establish that lots of AI safety work needs to be done. I think they are successful in this. Then you come along and give your analogy to other cases (rockets, vaccines) and argue that lots of AI safety work will in fact be done, enough that we don’t need to worry about it. I interpret that as an attempt to meet the burden, rather than as an argument that the burden doesn’t need to be met.
But maybe this is a merely verbal dispute now. I do agree that OT and ICT by themselves, without any further premises like “AI safety is hard” and “The people building AI don’t seem to take safety seriously, as evidenced by their public statements and their research allocation” and “we won’t actually get many chances to fail and learn from our mistakes,” do not establish more than, say, 1% credence in “AI will kill us all,” if even that. But I think it would be a misreading of the classic texts to say that they were wrong or misleading because of this; probably if you went back in time and asked Bostrom right before he published the book whether he agreed with you re the implications of OT and ICT on their own, he would have completely agreed. And the text itself seems to agree.
Thanks for the constructive criticism. I didn’t cite things because this isn’t research, just a dump of my thoughts. Now that you mention it, yeah I probably should have talked about area denial tech. I had considered saying something about EMP/jamming/etc., I don’t know why I didn’t...
For the other stuff you mention, well, I didn’t think that fit the purpose of the post. The title is “What a 20-year lead in military tech might look like,” not “What military conflicts will look like in 20 years.”
I mean, I did say as much in the OP basically. Defense is always more cost-efficient than offense; the problem is that when you sacrifice your mobility you have to spread out your forces whereas the enemy can concentrate theirs.
For attacking drones, winning means the same thing it usually does. I don’t see how it would be different.
I think it’s a reasonable and well-articulated worry you raise.
My response is that for the graphing calculator, we know enough about the structure of the program and the way in which it will be enhanced that we can be pretty sure it will be fine. In particular, we know it’s not goal-directed or even building world-models in any significant way; it’s just performing specific calculations directly programmed by the software engineers.
By contrast, with GPT-3 all we know is that it’s a neural net that was positively reinforced to the extent that it correctly predicted words from the internet during training, and negatively reinforced to the extent that it didn’t. So it’s entirely possible that it does, or will eventually, have a world-model and/or goal-directed behavior. It’s not guaranteed, but there are arguments to be made that “eventually” it would have both, i.e. if we keep making it bigger and giving it more internet text and training it for longer. I’m rather uncertain about the arguments that it would have goal-directed behavior, but I’m fairly confident in the argument that eventually it would have a really good model of the world.

The next question is then how this model is chosen. There are infinitely many world-models that are equally good at predicting any given dataset, but that diverge in important ways when it comes to predicting whatever is coming next. It comes down to what “implicit prior” is used. And if the implicit prior is anything like the universal prior, then doom. Now, it probably isn’t the universal prior. But maybe the same worries apply.
One scenario that worries me: At first the number of AIs is small, and they aren’t super smart, so they mostly just host normal human memes and seem as far as we (and even they) can tell to be perfectly aligned. Then, they get more widely deployed, and now there are many AIs and maybe they are smarter also, and alas it turns out that AIs are a different environment than humans, in a way which was not apparent until now. So different memes flourish and spread in the new environment, and bad things happen.
I was imagining an AI that is less smart and innovative than Elon Musk, but thinks faster and can be copied arbitrarily many times. Elon is limited to one brain thinking eighteen hours a day or so; the vast majority of the thought-work is done by the other humans he commands. If we had AI that could think as well as a pretty good engineer, but significantly faster, and we could make lots of copies fairly cheaply, then we’d be able to do things much faster than Elon can with his teams of humans.
Interesting, could you provide a link?
Another interpretation is that the predictions were right about what was possible, but wrong about how long it would take. For example, Kurzweil’s 1999 predictions of what 2009 would look like were mostly wrong, but if instead you pretend they are predictions about 2019 they are almost entirely correct.
What about a membership system? Like a union or guild? People pay their dues and agree to support the collective actions, and the union uses the funds to hire people to investigate cases and plan strategy etc. Like with insurance, someone too likely to cause trouble might not be allowed in. (This way we prevent the group from quickly being dominated by real witches.)
Yes? Not all of it, but definitely much of it is. It’s unfair to complain about GPT-3’s lack of ability to simulate you to get out of the box, etc., since it’s way too stupid for that, and the whole point of AI safety is to prepare for when AI systems are smart. There’s a whole chunk of the literature now on “Prosaic AI safety” which is designed to deal with pretty much exactly the sort of thing GPT-3 is. And even the more abstract agent foundations stuff is still relevant; for example, the “Universal prior is malign” stuff shows that in the limit GPT-N would likely be catastrophic, and that insight was gleaned from thinking a lot about Solomonoff induction, which is a very agent-foundationsy thing to be doing.
Good question. I think Johnswentworth’s comment is an excellent explanation of why so many high-tech nations have lost to lower-tech insurgencies in recent decades. I think the standard distinction is between “conventional” and “unconventional” warfare. If I were to write a list of near-future military tech focused on unconventional warfare, the sorts of things I would emphasize would be different (e.g. I’d emphasize persuasion tools, command and control, intel, and maybe battle bots).
I’m not sure the X-year-lead idea is useful for describing Vietnam, Afghanistan, etc. In those cases at least, the insurgency was heavily supported by a technologically advanced superpower. So they were technologically behind in most areas, but in some of the areas that were most important, they were totally up to date. And they were also a different kind of war entirely, one focused almost entirely on winning hearts and minds rather than on killing and blowing up stuff.
I know less about the Italy-Ethiopia conflicts.
Another thing worth saying is that a 20-year lead today is a much bigger deal than a 20-year lead in, say, 1500. Probably also a bigger deal than a 20-year lead in 1900.
Mmmm, yes. I think it’s too quick to say that all modern war is like this; I think type 2 is definitely still possible and type 1 shouldn’t be completely ruled out either, especially if we are thinking about crazy future AI scenarios. But other than that I agree, and found your trichotomy insightful. Thanks!
I’d be interested to hear more about what you have in mind!
If we put Elon Musk in charge of making it happen and gave him a hundred billion dollars, I think it could all be achieved in 5-10 years. If we had advanced-but-not-godlike AI, it would take less than a year. That’s my prediction anyway.