I think my take is roughly “What Paul would think if he had significantly shorter timelines.”
Do you think that some of my disagreements should change if I had shorter timelines?
(As mentioned last time we talked, but readers might not have seen: I’m guessing ~15% on singularity by 2030 and ~40% on singularity by 2040.)
I think most of your disagreements on this list would not change.
However, I think if you conditioned on 50% chance of singularity by 2030 instead of 15%, you’d update towards faster takeoff, less government/societal competence (and thus things more likely to fail at an earlier, less dignified point), more unipolar/local takeoff, lower effectiveness of coordination/policy/politics-style strategies, less interpretability and other useful alignment progress, less chance of really useful warning shots… and of course, significantly higher p(doom).
To put it another way: when I imagine what (I think) your median future looks like, it’s got humans still in control in 2035, sitting on top of giant bureaucracies of really cheap, really smart proto-AGIs that fortunately aren’t yet good enough at certain key skills (like learning-to-learn, or concept formation, or long-horizon goal-directedness) to be an existential threat, but are definitely really impressive in a bunch of ways, are reshaping the world economy and political landscape, and are causing various minor disasters here and there that serve as warning shots. So the whole human world is super interested in AI stuff; policymakers are all caught up on the arguments for AI risk; risks are generally taken seriously instead of dismissed as sci-fi; and there are probably international treaties and such. Meanwhile, the field of technical alignment has had 13 more years to blossom, lots of progress has probably been made on interpretability and ELK and whatnot, and there are 10x more genius researchers in the field who already have 5+ years of experience… and even in this world, the singularity is still 5+ years away, and there are probably lots of expert forecasters looking at awesome datasets of trends on well-designed benchmarks, predicting with some confidence when it will happen and what it’ll look like.
This world seems pretty good to me: there is definitely still lots of danger, but I feel like there’s a >50% chance things will be OK. Alas, it’s not the world I expect, because I think things will probably happen sooner and go more quickly than that, with less time for the world to adapt and prepare.
I personally found this to be a very helpful comment for visualizing how things could go.
These figures surprise me; I thought you believed in shorter timelines because of Agreement #8 in your post, where you said “[Transformative AI] is more likely to be years than decades, and there’s a real chance that it’s months”.
~40% by 2040 sounds like an expectation of transformative AI probably taking decades. (Unless I’m drawing a false equivalence between transformative AI and what you mean by “singularity”.)
In agreement #8 I’m talking about the time from “large impact on the world” (say increasing GDP by 10%, automating a significant fraction of knowledge work, “feeling like TAI is near,” something like that) to “transformative impact on the world” (say singularity, or 1-2 year doubling times, something like that). I think right now the impact of AI on the world is very small compared to this standard.
Thanks, that clarifies for me the two different periods of time you’re talking about.