I agree that AGI is more omni-use than bioweapons, and thus that it will be harder to get people not to develop and use it. I think our prospects look pretty bleak in this scenario, but it’s not completely hopeless.
For human cloning, what I had in mind was a nation cloning its smartest individuals for the purpose of having better science/tech. Think of what the US could have accomplished if it had 10,000 von Neumanns instead of 1.
Even as a series of worse and worse AGI accidents occurs, with out-of-control AGIs self-replicating around the internet etc., a few people will keep trying to fix the unfixable AGI, seeing this as the only path to getting this slow-rolling catastrophe under control (while actually making it worse).
Maybe at this point there would be the political will for a Butlerian Jihad. ;) Or more seriously, a self-imposed ban on AGI similar to the current self-imposed bans on human cloning and biological weapons. I agree this is a long shot given our current experience with climate change, but still, it seems possible. And perhaps the AGI accidents would be more newsworthy and gripping than climate change is, making it easier to rouse the public.
This is awesome, thanks!
So, to check my understanding: You have set up a sort of artificial feedback loop, where there are N overlapping patterns of hills, and each one gets stronger the farther you travel in a particular dimension/direction. So if one or more of these patterns tends systematically to push the ball in the same direction that makes it stronger, you’ll get a feedback loop. And then there is selection between patterns, in the sense that the pattern which pushes the strongest will beat the ones that push more weakly, even if both have feedback loops going.
And then the argument is: even though these feedback loops were artificial / baked in by you, in “natural” search problems there might be a similar situation… What exactly is the reason for this? I guess my confusion is about whether to expect real-life problems to have this property where moving in a particular direction strengthens a particular pattern. One way I could see this happening is if the patterns are themselves pretty smart, and are able to sense which directions strengthen them at any given moment. Or it could happen if, by chance, there happens to be a direction and a pattern such that the pattern systematically pushes in that direction and the direction systematically strengthens that pattern. But how likely are these? I don’t know. I guess your case is a case of the second, but it’s rigged a bit, because of how you built in the systematic-strengthening effect.
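To make sure I’ve got it, here’s a tiny toy version of the setup as I understand it (my own construction, not your actual code; the dimension, gains, and step size are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_patterns, steps, lr = 10, 5, 2000, 0.01

# Each "pattern of hills" pushes the ball along its own random direction,
# and its strength grows exponentially with how far the ball has already
# traveled along that direction: the baked-in feedback loop.
dirs = rng.normal(size=(n_patterns, dim))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
gains = rng.uniform(0.5, 1.5, size=n_patterns)  # how sharply each self-amplifies

x = np.zeros(dim)
for _ in range(steps):
    strengths = np.exp(gains * (dirs @ x))
    if strengths.max() > 1e3:  # one pattern has clearly run away; stop here
        break
    x += lr * (strengths[:, None] * dirs).sum(axis=0)

print("winner:", int(strengths.argmax()), "strengths:", strengths.round(2))
```

If I’ve understood correctly, whichever pattern self-reinforces fastest should come to dominate the ball’s motion, which is the selection-between-patterns effect I described.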
Am I following, or am I misunderstanding?
Like ignoranceprior said, my AI Impacts post has three intuitive ways of thinking about the results:
Way One: Let’s calculate some examples of prediction patterns that would give you Brier scores like those mentioned above (see the code sketch after Way Three). Suppose you make a bunch of predictions with 80% confidence and you are correct 80% of the time. Then your Brier score would be 0.32, roughly middle of the pack in this tournament. If instead it was 93% confidence correct 93% of the time, your Brier score would be 0.132, very close to the best superforecasters and to GJP’s aggregated forecasts.[14] In these examples, you are perfectly calibrated, which helps your score—more realistically you would be imperfectly calibrated and thus would need to be right even more often to get those scores.
Way Two: “An alternative measure of forecast accuracy is the proportion of days on which forecasters’ estimates were on the correct side of 50%. … For all questions in the sample, a chance score was 47%. The mean proportion of days with correct estimates was 75%…”[15] According to the chart in the original post, the superforecasters were on the right side of 50% almost all the time.[16]
Way Three: “Across all four years of the tournament, superforecasters looking out three hundred days were more accurate than regular forecasters looking out one hundred days.”[17] (Bear in mind, this wouldn’t necessarily hold for a different genre of questions. For example, information about the weather decays in days, while information about the climate lasts for decades or more.)
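To double-check the arithmetic in Way One, here’s a quick sketch using the two-outcome Brier score (twice the single-probability squared error, so that guessing 50% on everything scores 0.5; I’m assuming that’s the convention behind the numbers above, since it reproduces them):

```python
def expected_brier(confidence: float, hit_rate: float) -> float:
    """Expected two-outcome Brier score for a forecaster who predicts
    at `confidence` and turns out right `hit_rate` of the time."""
    score_if_right = 2 * (1 - confidence) ** 2
    score_if_wrong = 2 * confidence ** 2
    return hit_rate * score_if_right + (1 - hit_rate) * score_if_wrong

print(expected_brier(0.80, 0.80))  # 0.32  (middle of the pack)
print(expected_brier(0.93, 0.93))  # ~0.13 (superforecaster territory)
```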
I’m glad you made that disclaimer. In our community—LW—the ratio of excitement bias to childishness bias is probably unusually high, perhaps even high enough that we need to be more on guard against excitement. But in the wider community of “smart and/or important people thinking about AI,” it seems pretty clear that childishness bias is much, much stronger.
Like, I agree that Michael Wulfson is a person for whom the excitement bias was stronger, judging by his story. But even for him, the childishness bias was super strong too, and eventually won out. Most people are not like him; most people will find the childishness bias / excitement bias ratio much higher than he did.
Indeed, I’d go so far as to say that if this is the best example of excitement bias being stronger that we can think of, that’s pretty good evidence that in fact childishness bias is usually way stronger.
I’m not sure this is strong counterevidence, because it can be interpreted as a case of childishness bias overcoming excitement bias:
> But there were also countervailing effects in my mind, leading away from the god scenario. The strongest was the outlandishness of it all. I had always been dismissive of ideas that seem like doomsday theories, so I wasn’t automatically comfortable giving the god scenario credence in my mind. I was hesitant to introduce the idea to people who I thought might draw negative conclusions about my judgement.
I like your point #2; I should think more about how the 30 year number changes with size. Obviously it’s smaller for bigger entities and bigger for smaller entities, but how much? E.g. if we teleported 2020 Estonia back into 1920, would it be able to take over the world? Probably. What about 1970 though? Less clear.
Military power isn’t what I’m getting at either, at least not if measured in the way that would result in AI companies having little of it. Cortez had, maybe, 1/10,000th of the military power of Mexico when he got started. At least if you measure in ways like “What would happen if X fought Y.” Probably 1/10,000th of Mexico’s military could have defeated Cortez’ initial band.
If we try to model Cortez’ takeover as him having more of some metric than all of Mexico had, then presumably Spain had several orders of magnitude more of that metric than Cortez did, and Western Europe as a whole had at least an order of magnitude more than that. So Western Europe had *many* orders of magnitude more of this stuff, whatever it is, than Mexico, even though Mexico had a similar population and GDP. So they must have been growing much faster than Mexico for quite some time to build up such a lead—and this was before the industrial revolution! More generally, this metric that is used for predicting takeovers seems to be the sort of thing that can grow and/or shrink orders of magnitude very quickly, as illustrated by the various cases throughout history of small groups from backwater regions taking over rich empires.
(Warning: I’m pulling these claims out of my ass, I’m not a historian, I might be totally wrong. I should look up these numbers.)
Glad to hear you are interested! Well, I’m in US Eastern time, but timing can be flexible. If we have enough people, perhaps Blog Post Day will effectively be longer than 24 hours. I’m thinking it would be a relatively casual affair, with people dropping in or out as they see fit.
Thanks in advance to those who join me on this venture! And those who give advice, criticism, etc.
If we get sufficient interest, it might be good to organize local meetups. Anyone else in North Carolina want to meet up with me for this?
I was thinking of an initially large country growing fast via AI, yes. That still counts; it is soft takeoff leading to DSA. However, I am also making much stronger claims than that—I think it could happen with a corporation or rogue AGI.
I don’t think annual income is at all a good measure of how close an entity is to taking over the world. When Cortez landed in Mexico he had less than 1/100,000th of the income, population, etc. of the region, yet he ruled the whole place three years later. Then a few years after that Pizarro repeated the feat in Peru, good evidence that it wasn’t just an amazing streak of luck.
I nominate this thing johnswentworth did. In addition to the reasons he gives, I’ll add that being able to learn on your own, quickly, seems like a good skill to have, and related to (though maybe not the same thing as) rationality.
I think I find your overall conclusion plausible, but your argument for it was in places dubious:
But again, even if you assume I’m wrong, that still leaves us with universities that struggle to optimize for 2, 3, and maybe 4, losing out on 5 in the process.
One could instead interpret the situation as: Universities are optimizing hard for 5, and as a result they are understandably losing out on 2, 3, and 4 in the process.
Indeed, I think there is something to be said for this. A few years ago I half-jokingly wrote a paper titled “Kallipolis, USA,” in which I argue that the present-day USA is in fact Plato’s ideal state.
A big part of my argument was the way the university system works. In particular (in conjunction with the rest of society), it seems to be optimizing pretty hard to get people to “follow their passion,” and by forcing everyone to go to college and take gen-ed requirements the system is arguably doing the best it can to scout and recruit people suited to the priesthood/academia.
How necessary is it that there be an explicit side-channel? Could you not get the same results in the standard situation in which an agent is selecting actions on the basis of expected utility?
Ah, that does help, thanks. In my words: a search process that is vulnerable to local minima doesn’t necessarily contain a secondary search process, because it might not be systematically comparing local minima and choosing between them according to some criteria. It just goes for the first one it falls into (or, slightly more nuanced, the first sufficiently big one it falls into).
By contrast, in the ball rolling example you gave, the walls/ridges were competing with each other, such that the “best” one (or something like that) would be systematically selected by the ball, rather than just the first one or the first-sufficiently-big one.
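In code, I’d express the contrast something like this (my own sketch; the “competition” half is basically the standard basin-hopping trick, not something from your post):

```python
import math, random

def f(x):
    # A toy landscape with several local minima of different depths.
    return math.sin(3 * x) + 0.1 * x * x

def greedy_descent(f, x0, step=0.1, iters=1000):
    """Rolls downhill and keeps the first basin it falls into."""
    x = x0
    for _ in range(iters):
        x = min((x - step, x, x + step), key=f)
    return x

def basin_competition(f, x0, step=0.1, jump=5.0, iters=200):
    """Basins compete: keep sampling rival basins and switch whenever
    a rival beats the incumbent, so the 'best' basin wins out."""
    x = greedy_descent(f, x0, step)
    for _ in range(iters):
        rival = greedy_descent(f, x + random.uniform(-jump, jump), step)
        if f(rival) < f(x):
            x = rival
    return x

print(greedy_descent(f, 4.0))     # stuck in the shallow basin near x = 3.6
print(basin_competition(f, 4.0))  # usually ends in the deepest basin, near x = -0.5
```

The first process is stuck with whatever it falls into; only the second contains something like a secondary search over minima.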
So in that case, looking over your list again...
OK, I think I see how organic life arising from chemistry is an example of a secondary search process. It’s not just a local minimum that chemistry found itself in; it’s a big competition between different kinds of local minima. And now I think I see how this would go in the other examples too. As I originally said in my top-level comment, though, I’m not sure this applies to the example I brought up. Would the “insert my name as the author of all useful heuristics” heuristic be outcompeted by something else eventually, or not? I bet not, which indicates that it’s a “mere” local minimum and not one that is part of a broader secondary search process.
I should add though that I haven’t systematically examined these graphs yet, so it’s possible I’m just missing something—e.g. it occurs to me right now that maybe some of these graphs I saw were really logistic functions rather than hyperbolic or exponential-until-you-hit-limits. I should make some more and look at them more carefully.
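For my own reference, here are the three shapes I need to distinguish (arbitrary illustrative parameters, nothing fitted to the actual graphs):

```python
import numpy as np

t = np.linspace(0, 1.8, 10)
exponential = np.exp(3 * t)                  # steady exponential growth
hyperbolic = 1 / (2 - t)                     # finite-time blowup at t = 2
logistic = 10 / (1 + np.exp(-6 * (t - 1)))   # S-curve leveling off near 10

for name, y in (("exp", exponential), ("hyp", hyperbolic), ("logistic", logistic)):
    print(name, y.round(2))
```

The annoying part is that early on all three can look fairly similar; the differences only show up near the end of the data.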
Yes, thanks! I mostly agree with that assessment,* though as an aside I have a beef with the implication that Bostrom, Yudkowsky, etc. expect discontinuities. That beef is with Paul Christiano, not you. :)
The biggest update this has given me so far, I think, is that it seems quite possible to get an intelligence explosion even without economic feedback loops. Like, even with a fixed compute/money budget—or even with a fixed number of scientists and a fixed amount of research funding—we could get a singularity. At least in principle. This is weird, because in practice I am pretty sure I remember reading that the growth we’ve seen so far is best explained by an economic feedback loop: better technology allows for a bigger population and economy, which allows for more scientists and funding, which allows for better technology. So I’m a bit confused, I must say—my model is giving me results I would have predicted wouldn’t happen. (See the toy sketch below.)
*There have been a few cases where the growth didn’t look hyperbolic, but rather like a steady exponential trend that then turns into a singularity. World GDP, by contrast, has what looks like at least three exponential trends in it, such that it is more parsimonious to model it as hyperbolic growth. I think.
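To make the fixed-inputs point concrete, here is a minimal toy model (my own sketch, not the model discussed above; the exponent phi and the blowup threshold are arbitrary assumptions):

```python
def years_to_threshold(a0=1.0, researchers=1.0, phi=0.5, dt=0.001, horizon=100.0):
    """Euler-integrate dA/dt = researchers * A**(1 + phi), i.e. a fixed
    pool of researchers whose productivity is multiplied by the current
    technology level A (raised to a small extra power phi)."""
    a, t = a0, 0.0
    while t < horizon:
        a += researchers * a ** (1 + phi) * dt
        t += dt
        if a > 1e12:  # treat crossing this (arbitrary) threshold as "singularity"
            return t
    return None

print(years_to_threshold())         # ~2.0: finite-time blowup despite fixed inputs
print(years_to_threshold(phi=0.0))  # ~27.6: plain exponential; time to any
                                    # threshold grows with log(threshold) instead
                                    # of converging to a fixed date
```

Nothing in here grows except the technology level itself, and yet any phi > 0 gives a finite-time blowup.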
Hmmm, this doesn’t work to distinguish the two for me. Couldn’t you say a local minimum involves a secondary optimizing search process that has that minimum as its objective? To use your ball analogy, what exactly is the difference between these twisty demon hills and a simple crater-shaped pit? (Or, what is the difference between a search process that is vulnerable to twisty demon hills and one which is vulnerable to pits?)