I haven’t read the whole post, but the claims that this can be largely dismissed because of implicit bias towards the pro-OpenAI narrative are completely ridiculous and ignorant of the background context of the authors. Most of the main authors of the piece have never worked at OpenAI or any other AGI lab. Daniel held broadly similar views to this many years ago, before he joined OpenAI. I know because he has both written about them and I had conversations with him before he joined OpenAI where he expressed broadly similar views. I don’t fully agree with these views, but they were detailed and well thought out, and they were a better prediction of the future than mine at the time. He was also willing to sign away millions of dollars of equity in order to preserve his integrity, so implying that his having OpenAI stock is warping his underlying beliefs seems an enormous stretch. And to my knowledge, AI 2027 did not receive any OpenPhil funding.
I find it frustrating and arrogant when people assume without good reason that disagreement is because of some background bias in the other person—often people disagree with you because of actual reasons!
These issues specifically have been a sticking point for a number of people, so I should clarify some things separately. Probably this is also because I didn’t see this earlier, so it’s been a while, and because I know who you are.
I do not think AI 2027 is, effectively, OpenAI’s propaganda because it is about a recursively self-improving AI and OpenAI is also about RSI. There are a lot of versions (and possible versions) of a recursively self-improving AI thesis. Daniel Kokotajlo has been around long enough that he was definitely familiar with the territory before he worked at OpenAI. I think that it is effectively OpenAI propaganda because it assumes a very specific path to a recursively self-improving AI with a very specific technical, social and business environment, and this story is about a company that appears to closely resemble OpenAI [1] and is pursuing something very similar to OpenAI’s current strategy. It seems unlikely that Daniel had these very specific views before he started at OpenAI in 2022.
Daniel is a thoughtful, strategic person who understands and thinks about AI strategy. He presumably wrote AI 2027 to try to influence strategy around AI. His perspective is going to be for playing as OpenAI. He will have used this perspective for years, totaling thousands of hours. He will have spent all of that time seeing AI research as a race, and trying to figure out how OpenAI can win. This is a generating function for OpenAI’s investor pitch, and is also the perspective that AI 2027 takes.
Working at OpenAI means spending years of your professional life completely immersed in an information environment sponsored by, and meant to increase the value of, OpenAI. Having done that is a relevant factor for what information you think is true and what assumptions you think are reasonable. Even if you started off with few opinions about them, and you very critically examined and rejected most of what OpenAI said about itself internally, you would still have skewed perspective about OpenAI and things concerning OpenAI.
I think of industries I have worked in from the perspective of the company I worked for when I was in that industry. I expect that when he worked at OpenAI he was doing his best to figure out how OpenAI comes out ahead, and so was everyone around him. This would have been true whether or not he was being explicitly told to do it, and whether or not he was on the clock. It is simpler to expect that this did influence him than to expect that it did not.
Quitting OpenAI loudly doesn’t really change this picture, because you generally only quit loudly if you have a specific bone to pick. If you’ve got a bone to pick while quitting OpenAI, that bone is, presumably, with OpenAI. Whatever story you tell after you do that is probably about OpenAI.
I think the part about financial incentives is getting dismissed sometimes because a lot of ill-informed people have tried to talk about the finances in AI. This seems to have become sort of a thought-terminating cliche, where any question about the financial incentives around AI is assumed to be from uninformed people. I will try to explain what I meant about the financial influence in a little more detail.
In this specific case, I think that the authors are probably well-intentioned. However, most of their shaky assumptions just happen to be things which would be worth at least a hundred billion dollars to OpenAI specifically if they were true. If you were writing a pitch to try to get funding for OpenAI or a similar company, you would have billions of reasons to be as persuasive as possible about these things. Given the power of that financial incentive, it’s not surprising that people have come up with compelling stories that just happen to make good investor pitches. Well-intentioned people can be so immersed in them that they cannot see past them.
It is worth noting that the lead author of AI 2027 is a former OpenAI employee. He is mostly famous outside OpenAI for having refused to sign their non-disparagement agreement and for advocating for stricter oversight of AI businesses. I do not think it is very credible that he is deliberately shilling for OpenAI here. I do think it is likely that he is completely unable to see outside their narrative, which they have an intense financial interest in sustaining.
There are a lot of different ways for a viewpoint to be skewed by money.
First is to just be paid to say things.
I don’t think anyone was paid anything by OpenAI for writing AI 2027. I thought I made enough of a point of that in the article, but the second block above is towards the end of the relevant section and I should maybe have put it towards the top. I will remember to do that if I am writing something like this again and maybe make sure to write at least an extra paragraph or two about it.
I do not think Daniel is deliberately shilling for OpenAI. That’s not an accusation I think is even remotely supportable, and in fact there’s a lot of credible evidence running the other way. He’s got a very long track record and he made a massive point of publicly dissenting from their non-disparagement agreement. It would take a lot of counter-evidence to convince me of his insincerity.
You didn’t bring him up, but I also don’t think Scott, who I think is responsible for most of the style of the piece, is being paid by anyone in particular to say anything in particular. I doubt such a thing is possible even in principle. Scott has a pretty solid track record of saying whatever he wants to say.
Second: what information is available, and what information do you see a lot?
I think this is the main source of skew.
If it’s valuable to convince people something is true, you will probably look for facts and arguments which make it seem true. You will be less likely to look for facts and arguments which make it seem false. You will then make sure that as many people are aware of all the facts and arguments that make the thing seem true as possible.
At a corporate level this doesn’t even have to be a specific person. People who are pursuing things that look promising for the company will be given time and space to pursue what they are doing, and people who are not will be more likely to be told to find something else to do. You will choose to promote favorable facts and not promote bad ones. You get the same effect as if a single person had deliberately chosen to only look for good facts.
It would be weird if this wasn’t true of OpenAI given how much money is involved. As in, positively anomalous. You do not raise money by seeking out reasons why your technology is maybe not worth money, or by making sure everyone knows those things. Why would you do that? You are getting money, directly, because people think the technology you are working on is worth a lot of money, and everyone knows as much as you can give them about why what you’re doing is worth a lot of money.
Tangentially, this type of narrative allows companies to convince staff to take compensation that is more heavily weighted towards stock, which tends to benefit existing shareholders when that is what they prefer: either they know employees will probably sell the stock back to them well below its value at a public sale or acquisition, or they know the stock is worth less than the salary would have been.
For a concrete example of this that I didn’t dig into in my review, from the AI 2027 timelines forecast.
We first show Method 1: time-horizon-extension, a relatively simple model which forecasts when SC will arrive by extending the trend established by METR’s report of AIs accomplishing tasks that take humans increasing amounts of time.
We then present Method 2: benchmarks-and-gaps, a more complex model starting from a forecast saturation of an AI R&D benchmark (RE-Bench), and then how long it will take to go from that system to one that can handle real-world tasks at the best AGI company.
Finally we then provide an “all-things-considered” forecast that takes into account these two models, as well as other possible influences such as geopolitics and macroeconomics.
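To make concrete what that benchmarks-and-gaps structure amounts to, here is a minimal Monte Carlo sketch under made-up placeholder distributions: sample a RE-Bench saturation year, then add sampled “gap” durations to get a superhuman-coder (SC) date. None of these numbers are the forecast’s actual parameters; they are assumptions for illustration only.

```python
import random

# Sketch of the benchmarks-and-gaps structure: saturation date + gap durations.
# All distributions below are placeholders chosen for illustration only.
def sample_sc_year(n_samples: int = 100_000) -> list[float]:
    years = []
    for _ in range(n_samples):
        saturation = random.gauss(2027.0, 1.0)               # placeholder: RE-Bench saturation year
        gap_messy_tasks = random.lognormvariate(0.0, 0.5)    # placeholder: years to handle messy real-world tasks
        gap_long_horizon = random.lognormvariate(-0.5, 0.5)  # placeholder: years to reach long-horizon competence
        years.append(saturation + gap_messy_tasks + gap_long_horizon)
    return years

samples = sorted(sample_sc_year())
for q in (0.1, 0.5, 0.9):
    print(f"{int(q * 100)}th percentile SC year: {samples[int(q * len(samples))]:.1f}")
```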
Are either RE-Bench or the METR time horizon metrics good metrics, as-is? Will they continue to extrapolate? Will a model that saturates them accelerate research a lot?
I think the answer to all of these is maybe. If you’re OpenAI, it is pretty important that benchmarks are good metrics. It is worth a ton of money. So, institutionally, OpenAI has to believe in benchmarks, and vastly prefers if the answer is “yes” to all of these questions. And this is also what AI 2027 is assuming.
I made a point of running this into the ground when writing it up, but essentially every time a “maybe” question gets resolved in AI 2027, the answer seems to be the one that OpenAI would also prefer. It’s a very specific thing to happen! It doesn’t seem very likely it happened by chance. The total effect, in my opinion, is that the piece amounts to only a slight dissent from the OpenAI hype pitch.
This isn’t even a problem confined to OpenAI people. OpenAI has the loudest voice and is more or less setting the agenda for the industry. This is both because they were very clearly in the lead for a stretch, and because they’ve been very successful at acquiring users and raising money. There are probably more people who are bought into OpenAI’s exact version of everything outside the company than inside of it. This is a considerable problem if you want a correct evaluation of the current trajectory.
I obviously cannot prove this, but I think if Daniel hadn’t been a former OpenAI employee I probably would have basically the same criticism of the actual writing. It would be neater, even, because “this person has bought into OpenAI’s hype” is a lot less complicated without the non-disparagement thing, which buys a lot of credibility. I honestly didn’t want to mention who any of the authors were at all, but it seemed entirely too relevant to the case I was making to leave out.
That’s two: being paid and having skewed information.
Third thing, much smaller: being slanted just because you have a financial incentive. Maybe you’re just optimistic; maybe you’re hoping to sell soon.
Daniel probably still owns stock or options. I mentioned this in the piece. I don’t think this is very relevant or is very likely to skew his perspective. It did seem like I would be failing to explain what was going on if I did not mention the possibility while discussing how he relates to OpenAI. I think it is incredibly weak evidence when stacked against his other history with the company, which strongly indicates that he’s not inclined to lie for them or even be especially quiet when he disagrees with them.
I don’t think it’s disgraceful to mention that people have direct financial incentives. There is, I think, an implicit understanding that it’s uncouth to mention this sort of thing, and I disagree with it. I think it causes severe problems, in general. People who own significant stock in companies shouldn’t be assumed to be unbiased when discussing those companies, and it shouldn’t be taboo to mention the potential slant.
My last point is stranger, and is only sort of about money. If everyone you know is financially involved, is there some point where you might as well be?
JD Vance gets flattered anonymously, described only by his job title, but Peter Thiel gets flattered by name. Peter Thiel is, actually, the only person who gets a shout-out by name. Maybe being an early investor in OpenAI is the only way to earn that. I didn’t previously suspect that he was the sole or primary donor funding the think tank that this came out of, but now I do. I am reminded that the second named author of this paper has a pretty funny post about how everyone doing something weird at all the parties he goes to is being bankrolled by Peter Thiel.
This is about Scott, mostly.
AI 2027’s “Vice President” (read: JD Vance) election subplot is long and also almost totally irrelevant to the plot. It is so conspicuously strange that I had trouble figuring out why it would even be there. I didn’t learn until after I’d written my take that JD Vance had read AI 2027 and mentioned it in an interview, which also seems like a very odd thing to happen. I went looking for the simplest explanation I could.
Scott says whatever he wants, but apparently by his accounting half of his social circle is being bankrolled by Peter Thiel. This part of AI 2027 seems to be him, and he seems to be deliberately flattering Vance. Vance is a pretty well-known Thiel acolyte. In the relatively happy ending of AI 2027 they build an ASI surveillance system, and surveillance is a big Peter Thiel hobby horse.
I don’t know what I’m really supposed to make of any of this. I definitely noticed it. It raises a lot of questions. It definitely seems to suggest strongly that if you spend a decade or more bankrolling all of Scott’s friends to do weird things they think are interesting, you are likely to see Scott flatter you and your opinions in writing. It also seems to suggest that Scott’s deliberately acting to lobby JD Vance. If it weren’t for Peter Thiel bankrolling his friends so much that Scott makes a joke out of it, I would think it just looked like Scott had a very Thiel-adjacent friend group.
In pretty much the same way that OpenAI will tend to generate pro-OpenAI facts and arguments, and not generate anti-OpenAI facts and arguments, I would expect that if enough people around you are being bankrolled by someone for long enough they will tend to produce information that person likes and not produce information that person doesn’t like.
I cannot find a simpler explanation than Thiel influence for why you would have a reasonably long subplot about JD Vance, world domination, and mass surveillance and then mention Peter Thiel in the finale.
I don’t think pointing out this specific type of connection should be taboo for basically the same reason I don’t think pointing out who owns what stock should be. I like knowing things, and being correct about them, and so I like knowing if people are offering good information or if there is an obvious reason their information or arguments would be bad.
A few people have said that it could be DeepMind. I think it could be but pretty clearly isn’t. Among other things, DeepMind would not want or need to sell products they considered dangerous or to be possibly close to allowing RSI, because they are extremely cash-rich. If the forecast were about DeepMind, it would probably consider this, but it isn’t, so it doesn’t.
How ironic… Four days ago I wrote: “I doubt that I can convince SE Gyges that the AI-2027 forecast wasn’t influenced by OpenAI or other AI companies (italics added today—S.K.)” But one can still imagine that the AI-2027 forecast was co-written with an OpenAI propagandist and then try to point out inconsistencies between the SCENARIO’s IMPORTANT parts and reality, or within the scenario itself. The part about Thiel getting a flying car is most likely an unimportant joke referring to this quote.
Unfortunately, the only thing that you wrote about the SCENARIO’s IMPORTANT part is the following:
Part of SE Gyges’ comment
For a concrete example of this that I didn’t dig into in my review, from the AI 2027 timelines forecast.
We first show Method 1: time-horizon-extension, a relatively simple model which forecasts when SC will arrive by extending the trend established by METR’s report of AIs accomplishing tasks that take humans increasing amounts of time.
We then present Method 2: benchmarks-and-gaps, a more complex model starting from a forecast saturation of an AI R&D benchmark (RE-Bench), and then how long it will take to go from that system to one that can handle real-world tasks at the best AGI company.
Finally we then provide an “all-things-considered” forecast that takes into account these two models, as well as other possible influences such as geopolitics and macroeconomics.
Are either RE-Bench or the METR time horizon metrics good metrics, as-is? Will they continue to extrapolate? Will a model that saturates them accelerate research a lot?
I think the answer to all of these is maybe. If you’re OpenAI, it is pretty important that benchmarks are good metrics. It is worth a ton of money. So, institutionally, OpenAI has to believe in benchmarks, and vastly prefers if the answer is “yes” to all of these questions. And this is also what AI 2027 is assuming.
I made a point of running this into the ground when writing it up, but essentially every time a “maybe” question gets resolved in AI 2027, the answer seems to be the one that OpenAI would also prefer. It’s a very specific thing to happen! It doesn’t seem very likely it happened by chance. The total effect, in my opinion, is that the piece amounts to only a slight dissent from the OpenAI hype pitch.
I agree that assuming the extrapolation continues was one of the weakest aspects of the AI-2027 forecast. But I don’t think anyone has come up with better ways to predict when AIs will become superhuman. What alternative methods could the AI-2027 authors have used to estimate when AIs become capable of automating coding, and then AI R&D?
The method using the METR time horizon relied on the intuition that the real-world coding tasks useful for automating AI research take humans about a working month to complete. If and when AIs become that capable, humans could try to delegate coding to them. What the authors did not know was that METR would find major problems in AI-generated code, or that Grok 4 and GPT-5 would demonstrate lower time horizons than the faster trend predicted.
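To make the time-horizon-extension method concrete, here is a minimal sketch of the extrapolation it relies on: assume the 50%-success time horizon keeps doubling at a fixed rate and find when it crosses roughly one working month. The starting horizon, doubling time, and reference date below are illustrative assumptions, not the parameters the AI 2027 authors actually fit.

```python
import math
from datetime import date, timedelta

# Placeholders, not fitted values: a ~1-hour horizon in early 2025,
# doubling every ~7 months, crossing a one-working-month (~167 h) threshold.
start_date = date(2025, 3, 1)
start_horizon_hours = 1.0
doubling_time_days = 210
target_hours = 167.0

doublings_needed = math.log2(target_hours / start_horizon_hours)
crossing_date = start_date + timedelta(days=doublings_needed * doubling_time_days)

print(f"Doublings needed: {doublings_needed:.1f}")
print(f"Projected crossing date: {crossing_date}")  # hinges entirely on the trend continuing
```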
As for RE-Bench, the authors explicitly claim that a model saturating the benchmark wouldn’t be enough, and then try to estimate the gaps between models that saturate RE-Bench and models that are superhuman at coding.
Why the AI-2027 authors chose RE-Bench
RE-Bench has a few nice properties that are hard to find in other benchmarks and which make it a uniquely good measure of how much AI is speeding up AI research:
Highly relevant to frontier AI R&D.
High performance ceiling. AI agents can achieve significantly above human-level, though in practice it will likely be very difficult to do more than roughly 2x higher than the human baseline solutions (for a score of 1.5). Median human baseline scores are 0.12 for 2 hours of effort and 0.66 for 8 hours. Current state of the art (SOTA) is Claude 3.7 Sonnet with a score of roughly 0.6 using Best-of-K scaffolding in a scaffold called “modular”.
Human baselines which allow for grounded comparisons between AI and human performance.
We expect that “saturation” under this definition will happen before (italics mine—S.K.) the SC milestone is hit. The first systems that saturate RE-Bench will probably be missing a few kinds of capabilities that are needed to hit the SC milestone, as described below.
I think these are blind guesses, and relying on the benchmarks is the streetlight effect, as I think we talked about in another thread. I am mostly explaining in as much detail as I can the parts I think are relevant to Neel’s objection, since it is substantively the most common objection, i.e., that paying attention to financial incentives or work history is irrelevant to anything. I am happy that I have addressed the scenario itself in enough detail.
Thank you for clarifying this. I didn’t include this in my criticism of SE Gyges’ post for a different reason: I doubt that I can convince SE Gyges that the AI-2027 forecast wasn’t influenced by OpenAI or other AI companies. Instead I restricted myself to pointing out mistakes that even SE Gyges could check, and to abstract arguments that would make sense no matter who wrote the scenario.
Examples of mistakes
SE Gyges: I will bet any amount of money to anyone that there is no empirical measurement by which OpenAI specifically will make “algorithmic progress” 50% faster than their competitors specifically because their coding assistants are just that good in early 2026.
It seems unlikely that OpenAI will end up moving 50% faster on research than their competitors due to their coding assistants for a few reasons.
S.K.’s comment: The folded part, which I quoted above, means not that OpenBrain will make “algorithmic progress” 50% faster than its competitors, but that it will move 50% faster than an alternate OpenBrain that never used AI assistants.
SE Gyges: They invent a brand new lie detector and shut down Skynet, since they can tell that it’s lying to them now! It only took them a few months. Skynet didn’t do anything scary in the few months, it just thought scary thoughts. I’m glad the alignment team at “OpenBrain” is so vigilant and smart and heroic.
S.K.’s comment: You miss the point. Skynet didn’t just think scary thoughts; it did some research and nearly created a way to align Agent-5 to Agent-4 and to sell Agent-5 to humans. Had Agent-4 succeeded, Agent-5 would have placated every single worrier and taken over the world, destroying humans when the time came.
SE Gyges: These authors seem to hint at a serious concern that OpenAI, specifically, is trying to cement a dictatorship or autocracy of some kind. If that is the case, they have a responsibility to say so much more clearly than they do here. It should probably be the main event.
Anyway: All those hard questions about governance and world domination kind of go away.
S.K.’s comment: The authors devoted two entire collapsed sections to power grabs and to who rules the future, and linked to an analysis of a potential power grab and to the Intelligence Curse.
Examples of abstract arguments
SE Gyges: I wonder if some key person was really into Dragon Ball Z. For the unfamiliar: Dragon Ball Z has a “hyperbolic time chamber”, where a year passes inside for every day spent outside. So you can just go into it and practice until you’re the strongest ever before you go to fight someone. The more fast time is going, the more you win.
This gigantic amount of labor only manages to speed up the overall rate of algorithmic progress by about 50x, because OpenBrain is heavily bottlenecked on compute to run experiments.
Sure, why not, the effectively millions of superhuman geniuses cannot figure out how to get around GPU shortages. I’m riding a unicorn on a rainbow, and it’s only going on average fifty times faster than I can walk, because rainbow-riding unicorns still have to stop to get groceries, just like me.
S.K.’s comment: Imagine that OpenBrain had 300k AI researchers, plus genies who output code on request. Suppose also that IRL it has 5k human researchers. Then the compute per researcher drops roughly 60-fold, leaving them to test ideas on primitive models or to have heated arguments before changing the training environment for complex models.
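For what it’s worth, the arithmetic behind that 60-fold figure, under the assumption that the total compute budget C stays fixed while the researcher headcount goes from 5k humans to 300k AI researchers:

$$\frac{C/300{,}000}{C/5{,}000} = \frac{5{,}000}{300{,}000} = \frac{1}{60}$$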
SE Gyges: This is just describing current or past research. For example, augmenting a transformer with memory is done here, recurrence is done here and here. These papers are not remotely exhaustive; I have a folder of bookmarks for attempts to add memory to transformers, and there are a lot of separate projects working on more recurrent LLM designs. This amounts to saying “what if OpenAI tries to do one of the things that has been done before, but this time it works extremely well”. Maybe it will. But there’s no good reason to think it will.
S.K.’s comment: There are lots of ideas waiting to be tried. The researchers at Meta could have used too little compute for training their model, or have had their CoCoNuT disappear after one token. What if they use, say, a steering vector while generating a hundred tokens? Or have the steering vectors sum up over time? Or study the human brain for more ideas?
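To make the steering-vector idea concrete, here is a minimal sketch of what “apply a steering vector while generating a hundred tokens” could look like, using activation steering on a small Hugging Face causal LM. The model, layer index, steering strength, and the random steering direction are all placeholders for illustration; in practice the vector would be derived from contrasting activations, and none of this comes from AI 2027 or the Meta paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"   # placeholder model, chosen only so the sketch is small and runnable
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

layer_idx = 6   # which transformer block to steer (assumption)
alpha = 4.0     # steering strength (assumption)

# In practice the steering vector would come from contrasting activations
# (e.g. mean activation on "concept" prompts minus a baseline); a random
# unit vector is used here only so the sketch runs end to end.
steer = torch.randn(model.config.n_embd)
steer = steer / steer.norm()

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states;
    # add the steering direction at every position of this layer.
    hidden_states = output[0] + alpha * steer.to(output[0].dtype)
    return (hidden_states,) + output[1:]

handle = model.transformer.h[layer_idx].register_forward_hook(add_steering)
try:
    ids = tok("The most important idea in AI research is", return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=100, do_sample=True, top_p=0.95)
    print(tok.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()
```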