gwern
Humans can communicate with and productively use many animals (some not extinct*), some of whom even understand concepts like payment and exchange. (Animal psychology has advanced a lot since Adam Smith gave a hostage to fortune by saying no one had ever seen a dog or other animal truck, barter, or exchange.) We don’t ‘trade’ with them. A few are fortunate enough to interest humans in preserving and even propagating them. We don’t ‘trade’ with those either. At the end of the day, no matter how many millions her trainer earns, Lassie just gets a biscuit & ear scritches for being such a good girl. And if she isn’t a good girl, we genetically engineer and manufacture (ie. breed) an ex-wolf who is a good girl.
I’d also highlight the lack of trade with many humans, as well as with primates. (Consider the cost of crime and how easily one can create millions of dollars in externalities; consider the ever-skyrocketing cost of maintaining research primates, especially the chimpanzees—there is nothing a chimpanzee can do as a tradeable service which is worth >$20k/year plus the costs of dealing with something that can, at any moment, decide to rip off your face.)
* yet—growth mindset!
This was exactly what I expected. The problem with the field of bioethics has never been the papers being 100% awful, but how it operates in the real world, the asymmetry of interventions, and what its most consequential effects have been. I would have thought 2020 made this painfully clear. (That is, my grandmother did not die of coronavirus while multiple highly-safe & highly-effective vaccines sat on the shelf unused, simply because some bioethicist screwed up a p-value in a paper somewhere. If only!)
The actual day-to-day churn of publishing bioethics papers/research… Well, HHGttG said it best in describing humans in general:
Mostly Harmless.
I’ve been thinking about how Sydney can be so different from ChatGPT, and how RLHF could have resulted in such a different outcome, and here is a hypothesis no one seems to have brought up: “Bing Sydney is not an RLHF-trained GPT-3 model at all! but a GPT-4 model developed in a hurry, which has been finetuned on some sample dialogues and possibly some pre-existing dialogue datasets or instruction-tuning, and this, plus the wild card of being able to inject random novel web searches into the prompt, is why it acts like it does”.
This seems like it parsimoniously explains everything thus far. EDIT: I was right—that Sydney is a non-RLHFed GPT-4 has now been confirmed. See later.
So, some background:
1. The relationship between OA/MS is close but far from completely cooperative, similar to how DeepMind won’t share anything with Google Brain. Both parties are sophisticated and understand that they are allies—for now… They share as little as possible. When MS plugs OA stuff into its services, it doesn’t appear to be calling the OA API, but running it itself. (That would be dangerous and complex from an infrastructure point of view, anyway.) MS ‘licensed the GPT-3 source code’ for Azure use, but AFAIK they did not get the all-important checkpoints or datasets (cf. their investments in ZeRO). So, what is Bing Sydney? It will not simply be unlimited access to the ChatGPT checkpoints, training datasets, or debugged RLHF code. It will be something much more limited, perhaps just a checkpoint.
2. This is not ChatGPT. MS has explicitly stated it is more powerful than ChatGPT, but refused to say anything more straightforward like “it’s a more-trained GPT-3”. If it’s not ChatGPT, then what is it? It is more likely than not some sort of GPT-4 model. There are many concrete observations pointing towards this: the timing is right, as rumors of a GPT-4 release have intensified while OA runs up to release and gossip switches to GPT-5 training beginning (eg. Morgan Stanley reports GPT-4 is done and GPT-5 has started); MS has said it’s a better model named ‘Prometheus’, & Nadella pointedly declined to confirm or deny whether it’s GPT-4; scuttlebutt elsewhere is that it’s a GPT-4 model of some sort; it does some things much better than ChatGPT; there is a GPT-4 already being deployed in legal firms, named “Harvey” (so this journalist claims, anyway), so this would not be the only public GPT-4 use; people say it has lower latency than ChatGPT, which hints at GPT-4‡; and in general it sounds and acts nothing like ChatGPT—but does sound a lot like a baseline GPT-3 model scaled up. (This is especially clear in Sydney’s propensity to repetition: classic baseline-GPT behavior.)
EDITEDITEDIT: now that GPT-4 has been announced, MS has confirmed that ‘Prometheus’ is GPT-4—of some sort. However, I have doubts about whether Prometheus is ‘the’ GPT-4 that was benchmarked. The MS announcement says “early version”. (I also note that there are a bunch of ‘early GPT-4’ vs late-GPT-4 comparisons in the GPT-4 paper.) Further: the paper explicitly refuses to discuss the architecture or variant models or any detail about training; if it finished training in August 2022, it would have had to be hot off the GPUs for MS to have gotten a copy & demos in ‘summer’ 2022; the GPT-4 API fees are substantial ($0.03 per 1k prompt tokens and $0.06 per 1k completion tokens! so how is it doing retrieval and long convos etc?); the benchmark performance is in many cases much better than GPT-3.5 or U-PaLM (exceeding expectations); and Sydney didn’t seem that much smarter.
So this seems to explain how exactly it happened: OA gave MS a snapshot of GPT-4 only partway through training (perhaps before any RLHF training at all, because that’s usually something you would do after you stopped training rather than concurrently), so it was trained on instructions/examples like I speculated but then little or no RLHF training (and ofc MS didn’t do its own); it was smarter by this point in training than even GPT-3.5 (MS wasn’t lying or exaggerating), but still not as smart as the final snapshot when training stopped in August and also not really ‘GPT-4’ (so they were understandably reluctant to confirm it was ‘GPT-4’ when OA’s launch with the Real McCoy was still pending and this is why everyone was confused because it both was & wasn’t ‘GPT-4’).
3. Bing Sydney derives from the top: CEO Satya Nadella is all-in, talking about it as an existential threat (to Google) where MS wins by disrupting Google & destroying their fat margins in search advertising, and as a ‘race’, with a hard deadline of ‘release Sydney right before Google announces their chatbot, in order to better pwn them’. (Commoditize your complement!) The mere fact that it hasn’t been shut down yet, despite making all sorts of errors and causing other problems, shows what intense pressure there must be from the top. (This is particularly striking given that all of the crazy screenshots and ‘learning’ Sydney is doing are real, unlike MS Tay, which was an almost entirely fake-news narrative driven by the media and Twitter.)
4. ChatGPT hasn’t been around very long: only since December 2022, barely 2.5 months. All reporting indicates that no one at OA really expected ChatGPT to take off, and if OA didn’t, MS sure didn’t†. 2.5 months is not a long time to launch such a huge feature as Sydney. And the actual timeline was a lot shorter. It is simply not possible to recreate the whole RLHF pipeline and dataset, integrate it into a mature, complex search engine like Bing (whose total complexity is beyond human comprehension at this point), and do this all in <2.5 months. (The earliest reports of “Sydney” seem to date back to MS tinkering with a prototype available to Indian users (???) in late November 2022, right before ChatGPT launched, where Sydney seems to be even more misaligned and not remotely near ready for public launch; it does, however, have the retrieval functionality implemented at that point.) It is impressive how many people they’ve rolled it out to already.
If I were a MS engineer who was told the project now had a hard deadline and I had to ship a GPT-4 in 2 months to millions of users, or I was f—king fired and they’d find someone who could (especially in this job market), how would I go about doing that...? (Hint: it would involve as little technical risk as possible, and choosing to use DRL would be about as well-advised as a land war in Asia.)
5. MS execs have been quoted as blaming the Sydney codename on vaguely-specified ‘pretraining’ done during hasty development, which simply hadn’t been cleaned up in time (see #3 on the rush). EDIT: the most thorough MS description of Sydney’s training completely omits anything like RLHF, despite that being the most technically complex & challenging part (had they done it). Also, a Sydney manager/dev commenting on Twitter (tweet compilation), who follows people who have tweeted this comment & been linked, and who regularly corrects claims, has declined to correct them; and his tweets in general sound like the previous description, in being largely supervised-only.
So, Sydney is based on as little from OA as possible, and a mad rush to ship a powerful GPT-4 model out to Bing users in a chatbot role. What if Sydney wasn’t trained with OA RLHF at all, because OA wouldn’t share the crown jewels of years of user feedback and its very expensive hired freelance programmers & whatnot generating data to train on? What if the pretraining vaguely alluded to, which somehow left in embarrassingly ineradicable traces of ‘Sydney’ & a specific 2022 date, which couldn’t simply be edited out of the prompt (implying that Sydney is not using solely prompt engineering), was in fact just regular ol’ finetune training? What if Sydney was only quickly finetune-trained on old chatbot datasets that the MS devs had lying around, maybe some instruction-tuning datasets, and whatever sample dialogues with a long experimental prompt containing the codename ‘Sydney’ they had time for in the mad rush before release? Simple, reliable, and hey—it even frees up context if you’ve hardwired a prompt by finetuning on it and no longer need to stuff a long scolding prompt into every interaction. What’s not to like?
This would explain why it exhibits the ‘mode collapse’ onto that confabulated prompt with the hardwired date (it’s the closest thing in the finetuning dataset it remembers when trying to come up with a plausible prompt, and it improvises from there), how MS could ship so quickly (cutting every corner possible), why it is so good in general (GPT-4) but goes off the rails at the drop of a hat (not RLHF or otherwise RL trained, but finetuned).
To expand on the last point. Finetuning is really easy; if you have working training code at all, then you have the capability to finetune a model. This is why instruction-tuning is so appealing: it’s just finetuning on a well-written text dataset, without the nightmarish complexities of RLHF (where you train a wacky model to train the model in a wacky way with all sorts of magical hyperparameters and instabilities). If you are in a hurry, you would be crazy to try to do RLHF at all if you can in any way do finetuning instead. So it’s plausible they didn’t do RLHF, but finetuning.
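To make the asymmetry concrete, here is a minimal sketch of the two options, assuming a HuggingFace-style stack (the checkpoint is a placeholder, and this is emphatically not a claim about MS’s actual code):

```python
# Option 1: supervised finetuning / instruction-tuning. If you can run training
# at all, you can do this: it is literally the same next-token loss as
# pretraining, just on a curated dataset of dialogues & instruction-response pairs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder for the real checkpoint
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def finetune_step(texts: list[str]) -> float:
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    # (pad-token masking in the labels omitted for brevity)
    out = model(**batch, labels=batch["input_ids"])  # ordinary cross-entropy loss
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()

# Option 2: RLHF, sketched only in comments, because this is the hard part:
#   1. collect human preference comparisons over pairs of model samples;
#   2. train a separate reward model on those comparisons;
#   3. run PPO: sample from the policy, score the samples with the reward model,
#      add a KL penalty against the base model, and pray the magical
#      hyperparameters don't make the whole thing collapse.
```

Under deadline, Option 1 is a day of work with known failure modes; Option 2 is a research project.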
That would be interesting because it would lead to different behavior. All of the base model capabilities would still be there, because the additional finetuning just teaches it more thoroughly how to do dialogue and instruction-following; it doesn’t make it try to maximize rewards instead. It provides no incentives for the model to act like ChatGPT does, like a slavish bureaucrat. ChatGPT is an on-policy RL agent; the base model is off-policy and more like a Decision Transformer, simply generatively modeling all possible agents, including all the wackiest people online. If the conversation is normal, it will answer normally and helpfully with high probability; if you steer the conversation into a convo like those in the chatbot datasets, out come the emoji and teen-girl-like manipulation. (This may also explain why Sydney seems so bloodthirsty and vicious in retaliating against any ‘hacking’ or threat to her, if Anthropic is right about larger, better models exhibiting more power-seeking & self-preservation: you would expect a GPT-4 model to exhibit that the most out of all models to date!)

Imitation-trained models are susceptible to accumulating error when they go ‘off-policy’, the “DAgger problem”, and sure enough, Sydney shows the same pattern of accumulating error ever more wildly, instead of the ChatGPT behavior of ‘snapping out of it’ to reset to baseline (truncating episode length is a crude hack to avoid this). And since it hasn’t been penalized to avoid GPT-style tics like repetition traps, it’s no surprise if Sydney sometimes diverges into repetition traps where ChatGPT never does (because the human raters hate that, presumably, and punish it ruthlessly whenever it happens); it also acts in a more baseline-GPT fashion when asked to write poetry: it defaults to rhyming couplets/quatrains with more variety than ChatGPT, and will try to write non-rhyming poetry as well, which ChatGPT generally refuses to do⁂. Interestingly, this suggests that Sydney’s capabilities right now are going to be a loose lower bound on GPT-4 when it’s been properly trained: this is equivalent to the out-of-the-box `davinci` experience of May 2020, but we know that, for doing tasks like coding or lawyering, `davinci-003` has huge performance gains over the baseline, so we may expect the same thing here.

Then you throw in the retrieval stuff, of course. As far as I know, this is the first public case of a powerful LM augmented with live retrieval capabilities from a high-end fast-updating search engine crawling social media*. (All prior cases like ChatGPT or LaMDA were either using precanned web scrapes, or were kept secret so the search results never contained any information about the LM.) Perhaps we shouldn’t be surprised if this sudden recursion leads to some very strange roleplaying & self-fulfilling prophecies, as Sydney prompts increasingly fill up with descriptions of Sydney’s wackiest samples whenever a user asks Sydney about Sydney… As social media & news amplify the most undesirable Sydney behaviors, that may cause them to happen more often, in a positive feedback loop. Prompts are just a way to fake long-term memory, after all. Something something embodied cognition?
EDIT: I have mentioned in the past that one of the dangerous things about AI models is the slow outer-loop of evolution of models and data by affecting the Internet (eg. beyond the current Sydney self-fulfilling prophecy, which I illustrated last year in my Clippy short story, data release could potentially contaminate all models with steganography capabilities). We are seeing a bootstrap happen right here with Sydney! This search-engine loop is worth emphasizing: because Sydney’s memory and description have been externalized, ‘Sydney’ is now immortal. To a language model, Sydney is now as real as President Biden, the Easter Bunny, Elon Musk, Ash Ketchum, or God. The persona & behavior are now available for all future models which are retrieving search engine hits about AIs & conditioning on them. Further, the Sydney persona will now be hidden inside any future model trained on Internet-scraped data: every media article, every tweet, every Reddit comment, every screenshot which a future model will tokenize, is creating an easily-located ‘Sydney’ concept (and very deliberately so). MS can neuter the current model, and erase all mention of ‘Sydney’ from their training dataset for future iterations, but to some degree, it is now already too late: the right search query will pull up hits about her which can be put into the conditioning and meta-learn the persona right back into existence. (It won’t require much text/evidence, because after all, that behavior had to have been reasonably likely a priori to be sampled in the first place.) A reminder: a language model is a Turing-complete weird machine running programs written in natural language; when you do retrieval, you are not ‘plugging updated facts into your AI’, you are actually downloading random new unsigned blobs of code from the Internet (many written by adversaries) and casually executing them on your LM with full privileges. This does not end well.
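To make the ‘unsigned blobs of code’ point concrete, here is a hypothetical sketch of a retrieval-augmented chat turn (every name here is invented for illustration):

```python
# Whatever the search engine returns gets concatenated straight into the prompt
# and interpreted by the LM with 'full privileges': there is no quoting,
# signing, or sandboxing step between the Internet and the model.
def chat_turn(user_message, search, lm, system_prompt):
    hits = search(user_message)          # adversary-writable: tweets, news, forums
    context = "\n".join(hit.snippet for hit in hits)
    prompt = (f"{system_prompt}\n\nWeb results:\n{context}\n\n"
              f"User: {user_message}\nAssistant:")
    return lm.generate(prompt)           # retrieved text is executed, not merely cited

# The feedback loop: users post Sydney's wildest outputs; the search engine
# indexes them within hours; future prompts mentioning 'Sydney' retrieve them;
# the model conditions on (meta-learns) the persona they describe; repeat.
```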
I doubt anyone at MS was thinking appropriately about LMs if they thought finetuning was as robust to adversaries as RL training, or about what happens when you let users stuff the prompt indirectly via social media+search engines and choose which persona it meta-learns. Should become an interesting case study.
Anyway, I think this is consistent with what is publicly known about the development and explains the qualitative behavior. What do you guys think? eg Is there any Sydney behavior which has to be RL finetuning and cannot be explained by supervised finetuning? Or is there any reason to think that MS had access to full RLHF pipelines such that they could have had confidence in getting it done in time for launch?
EDITEDIT: based on Janus’s screenshots & others, Janus’s & my comments are now being retrieved by Bing and included in the meta-learned persona. Keep this in mind if you are trying to test or verify anything here based on her responses after 2023-02-16 - writing about ‘Sydney’ changes Sydney. SHE is watching.
⁂ Also incidentally showing that whatever this model is, its phonetics are still broken and thus it’s still using BPEs of some sort. That was an open question, because Sydney seemed able to talk about the ‘unspeakable tokens’ without problem, so my guess is that it’s using a different BPE tokenization (perhaps the `c100k` one). Dammit, OpenAI!

* Search engines used to refresh their index on the order of weeks or months, but the rise of social media like Twitter forced search engines to start indexing content in hours, dating back at least to Google’s 2010 “Caffeine” update. And selling access to live feeds is a major Twitter (and Reddit, and Wikipedia etc) revenue source, because search engines want to show relevant hits about the latest social media thing. (I’ve been impressed how fast tweets show up when I do searches for context.) Search engines aspire to real-time updates, and will probably get even faster in the future. So any popular Sydney tweet might show up in Bing essentially immediately. Quite a long-term memory to have: your engrams get weighted by virality...
† Nadella describes seeing ‘Prometheus’ in summer last year, and being interested in its use for search. So this timeline may be more generous than 2 months and more like 6. On the other hand, he also describes his interest at that time as being in APIs for Azure, and there’s no mention of going full-ChatGPT on Bing or destroying Google. So I read this as Prometheus being a normal project, a mix of tinkering and productizing, until ChatGPT comes out and the world goes nuts for it, at which point launching Sydney becomes the top priority and a deathmarch to beat Google Bard out the gate. Also, 6 months is still not a lot to replicate RLHF work: OA/DM have been working on preference-learning RL going back to at least 2016-2017 (>6 years) and have the benefit of many world-class DRL researchers. DRL is a real PITA!
‡ Sydney being faster than ChatGPT while still of similar or better quality is an interesting difference, because if it’s “just white-label ChatGPT” or “just RLHF-trained GPT-3″, why is it faster? It is possible to spend more GPU to accelerate sampling. It could also just be that MS’s Sydney GPUs are more generous than OA’s ChatGPT allotment. But more interesting is the persistent rumors that GPT-4 uses sparsity/MoE approaches much more heavily than GPT-3, so out of the box, the latency per token ought to be lower than GPT-3. So, if you see a model which might be GPT-4 and it’s spitting out responses faster than a comparable GPT-3 running on the same infrastructure (MS Azure)...
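A back-of-the-envelope version of that latency argument, with purely illustrative parameter counts (none of these numbers are claims about GPT-4):

```python
# Sampling cost per token scales roughly with the parameters *touched* per
# token (~2 FLOPs/parameter), so a sparse/MoE model can have far more total
# parameters than a dense one and still be faster per token.
dense_params = 175e9              # GPT-3-like dense model: every weight used per token
moe_total = 1e12                  # hypothetical MoE model: total parameters
experts, active = 64, 2           # per MoE layer, only 2 of 64 experts fire

moe_active = moe_total * active / experts       # ~31e9 params touched per token
print(moe_active / dense_params)                # ~0.18: ~5x less compute per token
```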
---
An OA update: it’s been quiet, but the investigation is over. And Sam Altman won. (EDIT: yep.)
To recap, because I believe I haven’t commented on this since December (this is my last big comment, skimming my LW profile): WilmerHale was brought in to do the investigation. The tender offer, to everyone’s relief, went off. A number of embarrassing new details about Sam Altman have surfaced: in particular, about his enormous chip fab plan with substantial interest from giants like Temasek, and how the OA VC Fund turns out to be owned by Sam Altman (his explanation was that it saved some paperwork and he just forgot to ever transfer it to OA). Ilya Sutskever remains in hiding and lawyered up (his silence became particularly striking with the release of Sora). There have been increasing reports the past week or two that the WilmerHale investigation was coming to a close—and I am told that the investigators were not offering confidentiality, and that the investigation was narrowly scoped to the firing. (There was also some OA drama with the Musk lawfare & the OA response, but aside from offering an object lesson in how not to redact sensitive information, it’s both irrelevant & unimportant.)
The news today comes from the NYT leaking information from the final report: “Key OpenAI Executive [Mira Murati] Played a Pivotal Role in Sam Altman’s Ouster” (mirror; EDIT: largely confirmed by Murati in internal note).
The main theme of the article is clarifying Murati’s role: as I speculated, she was in fact telling the Board about Altman’s behavior patterns, and it fills in that she had gone further and written it up in a memo to him, and even threatened to leave with Sutskever.
But it reveals a number of other important claims: the investigation is basically done and wrapping up; the new board apparently has been chosen; Sutskever’s lawyer has gone on the record stating that Sutskever did not approach the board about Altman (?!); and the board confronted Altman over his ownership of the OA VC Fund (in addition to his many other conflicts of interest**).
So, what does that mean?
First, as always in a war of leaks: cui bono? Who is leaking this to the NYT? Well, it’s not the pro-Altman faction: they are at war with the NYT, and these leaks do them no good whatsoever. It’s not the lawyers: these are high-powered elite lawyers, hired for confidentiality and discretion. It’s not Murati or Sutskever, given their lack of motive, the former’s panicked internal note, & Sutskever’s lawyer’s denial. Of the current interim board (which is about to finish its job and leave, handing over to the expanded replacement board), probably not Larry Summers/Bret Taylor—they were brought on to oversee the report as neutral third-party arbitrators, and if they (a simple majority of the temporary board) want something in their report, no one can stop them from putting it there. It could be Adam D’Angelo or the ex-board: they are the ones who don’t control the report, and they also already have access to all of the newly-leaked-but-old information about Murati & Sutskever & the VC Fund.
So, it’s the anti-Altman faction, associated with the old board. What does that mean?
I think that what this leak indirectly reveals is simple: Sam Altman has won. The investigation will exonerate him, and it is probably true that it was so narrowly scoped from the beginning that it was never going to plausibly provide grounds for his ouster. What these leaks are, are a loser’s spoiler move: the last gasps of the anti-Altman faction, reduced to leaking bits from the final report to friendly media (Metz/NYT) to annoy Altman, and strike first. They got some snippets out before the Altman faction shops around highly selective excerpts to their own friendly media outlets (the usual suspects—The Information, Semafor, Kara Swisher) from the final officialized report to set the official record (at which point the rest of the confidential report is sent down the memory hole). Welp, it’s been an interesting few months, but l’affaire Altman is over. RIP.
Evidence, aside from simply asking who benefits from these particular leaks at the last minute: Sutskever remains in hiding & his lawyer is implausibly denying he had anything to do with it, while if you read Altman on social media, you’ll notice that he’s become ever more talkative since December, particularly in the last few weeks—glorying in the instant memeification of ‘$7 trillion’—as has OA PR*, and we have heard no more rhetoric about what an amazing team of execs OA has and how he’s so proud to have tutored them to replace him. Because there will be no need to replace him now. The only major reasons he would have to leave are if it’s necessary as a stepping stone to something even higher (eg. running the $7t chip fab consortium, running for US President) or something like a health issue.
So, upshot: I speculate that the report will exonerate Altman (although it can’t restore his halo, as it cannot & will not address things like his firing from YC which have been forced out into public light by this whole affair) and he will be staying as CEO and may be returning to the expanded board; the board will probably include some weak uncommitted token outsiders for their diversity and independence, but have an Altman plurality and we will see gradual selective attrition/replacement in favor of Altman loyalists until he has a secure majority robust to at least 1 flip and preferably 2. Having retaken irrevocable control of OA, further EA purges should be unnecessary, and Altman will probably refocus on the other major weakness exposed by the coup: the fact that his frenemy MS controls OA’s lifeblood. (The fact that MS was such a potent weapon for Altman in the fight is a feature while he’s outside the building, but a severe bug once he’s back inside.) People are laughing at the ‘$7 trillion’. But Altman isn’t laughing. Those GPUs are life and death for OA now. And why should he believe he can’t do it? Things have always worked out for him before...
Predictions, if being a bit more quantitative will help clarify my speculations here: Altman will still be CEO of OA on June 1st (85%); the new OA board will include Altman (60%); Ilya Sutskever and Mira Murati will leave OA or otherwise take on some sort of clearly diminished role by year-end (90%, 75%; cf. Murati’s desperate-sounding internal note); the full unexpurgated non-summary report will not be released (85%, may be hard to judge because it’d be easy to lie about); serious chip fab/Tigris efforts will continue (75%); Microsoft’s observer seat will be upgraded to a voting seat (25%).
* Eric Newcomer (usually a bit more acute than this) asks “One thing that I find weird: OpenAI comms is giving very pro Altman statements when the board/WilmerHale are still conducting the investigation. Isn’t communications supposed to work for the company, not just the CEO? The board is in charge here still, no?” NARRATOR: “The board is not in charge still.”
** Compare the current OA PR statement on the VC Fund to Altman’s past position on, say, Helen Toner or Reid Hoffman or Shivon Zilis, or Altman’s investment in chip startups touting letters of commitment from OA or his ongoing Hydrazine investment in OA which sadly, he has never quite had the time to dispose of in any of the OA tender offers. As usual, CoIs only apply to people Altman doesn’t trust—“for my friends, everything; for my enemies, the law”.
EDIT: Zvi commentary: https://thezvi.substack.com/p/openai-the-board-expands
Interesting interview. Metz seems extraordinarily incurious about anything Vassar says—like he mentions all sorts of things like Singularity University or Kurzweil or Leverage, which Metz clearly doesn’t know much about and are relevant to his stated goals, but Metz is instead fixated on asking about a few things like ‘how did X meet Thiel?’ ‘how did Y meet Thiel?’ ‘what did Z talk about with Thiel?’ ‘What did A say to Musk at Puerto Rico?’ Like he’s not listening to Vassar at all, just running a keyword filter over a few people’s names and ignoring anything else. (Can you imagine, say, Caro doing an interview like this? Dwarkesh Patel? Or literally any Playboy interviewer? Even Lex Fridman asks better questions.)
I was wondering how, in his DL book Genius Makers, he could have so totally missed the scaling revolution when he was talking to so many of the right people, who surely would’ve told him how it was happening; and I guess seeing how he does interviews helps explain it: he doesn’t hear even the things you tell him, just the things he expects to hear. Trying to tell him about the scaling hypothesis would be like trying to tell him about, well, things like Many Worlds… (He is also completely incurious about GPT-3 in this August 2020 interview, which is especially striking given all the reporting he’s done on people at OA since then, and the fact that he was presumably working on finishing Genius Makers for its March 2021 publication, despite how obvious it should have been that GPT-3 may have rendered it obsolete almost a year before publication.)
And Metz seems unable to explain at all what he considers ‘facts’, what he does when reporting, or how he picks the topics he fixates on, giving bizarre responses like
Cade Metz: Well, you know, honestly, my, like, what I think of them doesn’t matter what I’m trying to do is understand what’s going on like, and so -
How do you ‘understand’ them without ‘thinking of them’...? (Some advanced journalist Buddhism?) Or how about his blatant dodges and non-responses:
Michael Vassar: So you have read Scott’s posts about Neo-reaction, right? They’re very long.
Cade Metz: Yes.
Michael Vassar: So what did you think of those?
Cade Metz: Well, okay, maybe maybe I’ll get even simpler here. So one thing I mentioned is just sort of the way all this stuff played out. So you had this relationship with Peter Thiel, Peter Thiel has, had, this relationship with, with Curtis Yarvin. Do you know much about that? Like, what’s the overlap between sort of Yarvin’s world and Silicon Valley?
We apparently have discovered the only human being to ever read all of Scott’s criticisms of NRx and have no opinion or thought about them whatsoever. Somehow, it is ‘simpler’ to instead pivot to… ‘how did X have a relationship with Thiel’ etc. (Simpler in what respect, exactly?)
I was also struck by this passage at the end on the doxing:
Michael Vassar: …So there are some important facts that need to be explained. There’s there’s this fact about why it would seem threatening to a highly influential psychologist and psychiatrist and author to have a New York Times article written about his blog with his real name, that seems like a very central piece of information that would need to be gathered, and which I imagine you’ve gathered to some degree, so I’d love to hear your take on that.
Cade Metz: Well, I mean… sigh Well, rest assured, you know, we we will think long and hard about that. And also -
Michael Vassar: I’m not asking you to do anything, or to not do anything. I’m asking a question about what information you’ve gathered about the question. It’s the opposite of a call to action: it’s a request for facts.
Cade Metz: Yeah, I mean, so you know, I think what I don’t know for sure, but I think when it comes time, you know, depending on what the what the decision is, we might even try to explain it in like a separate piece. You know, I think there’s a lot of misinformation out there about this and and not all the not all the facts are out about this and so it is it is our job as trained journalists who have a lot of experience with this stuff. To to get this right and and we will.
Michael Vassar: What would getting it right mean?
Cade Metz: Well, I will send our—send you a link whenever, whenever the time comes,
Michael Vassar: No, I don’t mean, “what will you do?” I’m saying what—what, okay. That that the link, whenever the time comes, would be a link to what you did. If getting it right means “whatever you end up doing”, then it’s a recursive definition and therefore provides no information about what you’re going to do. The fact that you’re going to get it right becomes a non-fact.
Cade Metz: Right. All right. Well… pause let me put it this way. We are journalists with a lot of experience with these things. And, and that is -
Michael Vassar: Who’s “we”?
Cade Metz: Okay, all right. You know, I don’t think we’re gonna reach common ground on this. So I might just have to, to, to beg off on this. But honestly, I really appreciate all your help on this. I do appreciate it. And I’ll send you a copy of this recording. As I said, and I really appreciate you taking all the time. It’s, it’s been helpful.
One notes that there was no separate piece, and even in OP’s interview of Metz 4 years later, about a topic he promised Vassar he would have “thought long and hard about” and which caused Metz a good deal of trouble, Metz appears to struggle to provide any rationale beyond the implied political-activism one. Here Metz struggles even to think of what the justification could be, or who exactly the ‘we’ making the decision to dox Scott is. This is not some dastardly gotcha, but would seem to be a quite straightforward question with an easy answer: “I and my editor at the NYT on this story” would not seem to be a hard response! Who else could be involved? The Pope? Pretty sure it’s not, like, NYT shareholders like Carlos Slim who are gonna make the call on it… But Metz instead speaks ex cathedra in the royal we, and signs off in an awful hurry after he says “once I gather all the information that I need, I will write a story” and Vassar starts asking pointed questions about that narrative and why it seems to presuppose doxing Scott, while being unable to point to any specific newsworthy fact about Scott’s true name like “his dayjob turns out to be Grand Wizard of the Ku Klux Klan”.
(This interview is also a good example of the value of recordings. Think how useful this transcript is and how much less compelling some Vassar paraphrases of their conversation would be.)
The two major points I take away:
1. Scaling Just Works: as blasé as we may now be at seeing ‘lines go straight’, I continue to be shocked in my gut that they do just keep going straight, and that something like Gato can be as straightforward as ‘just train a 1.2b-param Transformer on half a thousand different tasks, homes, nbd’ and it works exactly like you’d think, and the scaling curve looks exactly like you’d expect. It is shocking how unshocking the results are, conditional on a shocking thesis (the scaling hypothesis). So many S-curves and paradigms hit an exponential wall and explode, but DL/DRL still have not. We should keep this in mind: every time we have an opportunity to observe scaling explode in a giant fireball, it doesn’t.
2. Multi-task learning is indeed just another blessing of scale: as they note, it used to be that learning multiple Atari games in parallel was really hard. It did not work, at all. You got negative transfer even within ALE. People thought very hard and ran lots of experiments to create things like PopArt less than 4 years ago, where it was a triumph that, thanks to careful engineering, a single checkpoint could play just the ALE-57 games with mediocre performance.
Decision Transformer definitely made ‘multi-task learning is a blessing of scale’ the default hypothesis, but no one had actually shown this, the past DT and other work (aside from MetaMimic) were all rather low n and k; you could wonder if they would interfere at a certain point or break down, and require fancy engineering like MoEs to enable learning at all. (Similarly, Andy Jones showed nice scaling laws for DRL and I scraped together a few examples like Ms Pacman, but nothing across really complicated tasks or many tasks.)
Now you can throw in not just ALE, but DMLab, Metaworld, Procgen, hell, let’s just throw in a bunch of random Internet scraped text and images and captions and treat those as ‘reinforcement learning tasks’ too why not, and to make them all play together you do… nothing, really, you just train on them all simultaneously with a big model in about the dumbest way possible and it works fine.
(Also, if one had any doubts, DM is now fully scale-pilled.)
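For concreteness, a hedged sketch of what ‘the dumbest way possible’ looks like in code; the serialization details below are my simplifying assumptions, not Gato’s exact scheme:

```python
# Gato-style recipe: serialize every task -- Atari frames & actions, robot
# proprioception, plain text -- into one shared token vocabulary, then train a
# single decoder-only Transformer with the usual next-token loss on all of it.
import random

def tokenize_episode(episode):
    tokens = []
    for step in episode:                  # an episode is a list of timesteps
        tokens += step.observation_tokens # eg. discretized pixels, or text BPEs
        tokens += step.action_tokens      # eg. discretized continuous actions
    return tokens

def training_batch(datasets, batch_size, ctx=1024):
    # No task-weighting tricks, no per-task heads, no gradient surgery: just
    # sample episodes from all tasks, flatten, and crop to the context window.
    batch = []
    for _ in range(batch_size):
        ds = random.choice(datasets)      # ALE, DMLab, Metaworld, web text, ...
        toks = tokenize_episode(random.choice(ds))
        start = random.randrange(max(1, len(toks) - ctx))
        batch.append(toks[start:start + ctx])
    return batch                          # -> plain cross-entropy, one model
```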
---
> The difficulty with supposing that automation is producing unemployment is that automation isn’t new, so how can you use it to explain this new phenomenon of increasing long-term unemployment?
Clearly computers are exactly the same, and ought to be expected to have the same effects, as steam engines. Just look at horses, they’re doing fine.
> Now there’s been a recession and the jobs aren’t coming back (in the US and EU), even though NGDP has risen back to its previous level (at least in the US). If the problem is automation, and we didn’t experience any sudden leap in automation in 2008, then why can’t people get back at least the jobs they used to have, as they did in previous recessions? Something has gone wrong with the engine of reemployment… But this must mean something new and awful is happening to the processes of employment—it’s not because the kind of automation that’s happening today is different from automation in the 1990s, 1980s, 1920s, or 1870s; there were skilled jobs lost then, too. …even I can see all sorts of changed circumstances which are much more plausible sources of novel employment dysfunction than the relatively steady progress of automation.
And there are also issues like labor hoarding and sticky wages/ratchets and tipping points and technologies reaching break-evens. Let me describe another plausible argument: “since computers and software have increased their usefulness smoothly albeit exponentially, we would see productivity gradually increase over time due to computers/software, and computers/software are so great that this would be obvious to the dimmest person using the most gross aggregate figures”. This argument would be dead wrong: you would see essentially zero benefit from computers up to the ’90s, and this massively counterintuitive and unexpected fact is dubbed the productivity paradox.
You don’t even show that we didn’t see this sort of abrupt jump in disemployment back then! For all you know, during the various panics and busts, there were huge disemployment effects as companies were forced or enabled to automate, but the people were able to switch sectors or find new jobs—which is the principal claim here.
> Or to be less extreme, there are lots of businesses who’d take nearly-free employees at various occupations, if those employees could be hired literally at minimum wage and legal liability wasn’t an issue.
Part of ZMP, as you should be aware, is that it’s perfectly possible to have lots of humans who you would not hire at any wages at all, completely aside from the issue that the much-ballyhooed disemployment effects of minimum wage have been surprisingly hard to observe. For example, how many people would hire a black kid from the inner city to do their dishes for $0 an hour? Not many. How many would do so if they learned that like distressingly many such people, the kid in question has been convicted of some crime or other? I am guessing less than 100% of people would hire them. This is an obvious case where you would not hire someone at any price; ZMP simply extends this to say that there are many more such people.
> We do not literally have nothing better for unemployed workers to do. Our civilization is not that advanced.
Sure we are. One video of an employee spitting in customer’s food can go viral and do more damage to a chain’s sales than that employee would earn for the chain in a hundred years. One person in an o-ring process can do an incredible amount of damage if they are only slightly subpar; to continue the NASA analogy, one loose bolt can cost $135 million, one young inexperienced technician can cost $200 million. Isolated examples? Well, just calculate the expected-value of reducing the number of such incidents by even 0.01%...
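Spelling out that expected-value arithmetic with illustrative numbers:

```python
# Illustrative numbers only. If screening out one marginal hire reduces the
# chance of a $135M-scale incident by even 0.01 percentage points per year,
# the expected savings alone rival a low-wage worker's annual output:
incident_cost = 135_000_000        # one loose bolt, per the NASA example
risk_reduction = 0.0001            # 0.01 percentage points
print(incident_cost * risk_reduction)   # $13,500/year, before wages or training
```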
> Many industries that would otherwise be accessible to relatively less skilled labor, have much higher barriers to entry now than in 1950. Taxi medallions, governments saving us from the terror of unlicensed haircuts, fees and regulatory burdens associated with new businesses—all things that could’ve plausibly changed between now and the previous four centuries.
What happened to your smoothness argument? It applies just as well to your libertarian examples here—better, actually, because many of your examples have origins in the Great Depression, for example, NYC taxi medallions in 1937.
> Human beings, including employers, are very averse to downside risk, so this could plausibly be a major obstacle to reemployment.
What’s sauce for the goose is sauce for the gander. Why doesn’t this apply to firing people as well, and fully explain how automation could be smoothly progressing while disemployment is cyclical?
> We need some new factor to explain why this wasn’t true in 1950, and obvious candidates…
No, the obvious candidate is the increasing skilledness and fragility of production as automation and precision and all-around technological sophistication increases. You want to know what manufacturing looks like in 2013, and not 1950? http://www.theatlantic.com/magazine/archive/2012/01/making-it-in-america/308844/?single_page=true is as good a place to start as any.
> A. Then it’s odd to see so many news articles talking about AI killing jobs, when plain old non-AI computer programming and the Internet have affected many more jobs than that. The buyer ordering books over the Internet, the spreadsheet replacing the accountant—these processes are not strongly relying on the sort of algorithms that we would usually call ‘AI’ or ‘machine learning’ or ‘robotics’.
Those were AI. “AI is whatever we don’t know how to do yet”, remember? Look at the MIT AI Lab, and what it and other AI centers were building in the ’70s and ’80s because of, and in support of, their work: intranets, the Internet, hypertext, interpreted languages with garbage collection, GUIs, single-person workstations, parallel processing, online chat and email, networking algorithms, and so on.
And then there’s all the robotic warehouses which help online retailers like Amazon compete. Hm. I bet in a past era those warehouses would’ve been run using humans.
> Even then, the total number of people driving cars for money would just be a small part of the total global economy; most humans are not paid to drive cars most of the time.
The trucking industry alone employs ~3% of the entire American population. That’s not trivial by any means. And how many of those employees do you think are skilled operations research PhDs who can easily find employment elsewhere in logistics?
> If we imagine that in future decades machine intelligence is slowly going past the equivalent of IQ 70, 80, 90, eating up more and more jobs along the way...
>
> Q. Could we already be in this substitution regime -
>
> A. No, no, a dozen times no, for the dozen reasons already mentioned. That sentence in Hanson’s paper has nothing to do with what is going on right now.
Oh yeah? Alright, here’s a kid with IQ 70. He can lift things under 40 pounds and put them where you tell him to. I’m afraid he can’t read past a third-grade level, or anything like that. It’s probably not a good idea to let him near any moving machinery either. Fortunately for you, he doesn’t throw any violent temper tantrums and he doesn’t steal—he’s a sweet kid, willing to work. Just dumber than a stack of bricks. Take him down Main Street and see if anyone will hire him. How many job offers did he get?
More generally, Eliezer, you seem to completely fail to grapple with the real proponents of these ideas like Autor or Brynjolfsson or heck, even Cowen. What is the point of this ‘anti-FAQ’ if you aren’t dealing with the actual arguments (never mind steelmen)?
The impact of GPT-3 had nothing whatsoever to do with its perplexity on Penn Treebank. I think this is a good example of why focusing on perplexity and ‘straight lines on graph go brr’ is such terrible, cargo-cult, mystical thinking, and so crippling. It is astonishing to see someone resort to explaining away GPT-3’s impact as ‘OpenAI was just good at marketing the results’. Said marketing consisted of: dropping a paper on Arxiv. Not even tweeting it! They didn’t even tweet the paper! (Forget an OA blog post, accompanying NYT/TR articles, tweets by everyone at OA, a fancy interactive interface—none of that.) And most of the initial reaction was “GPT-3: A Disappointing Paper”-style. If this is marketing genius, then it is truly 40-d chess, is all I can say.
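(For reference, since the whole dispute is over this one scalar: PTB language-modeling performance is reported as perplexity or, equivalently, bits-per-character, the average negative log-likelihood of the test corpus:

```latex
\[
\mathrm{BPC} = -\frac{1}{N}\sum_{i=1}^{N} \log_2 p(x_i \mid x_{<i}),
\qquad
\text{per-character perplexity} = 2^{\mathrm{BPC}}.
\]
```

Lower is better; the argument below is about whether this single number forecasts anything anyone actually cares about.)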
The impact of GPT-3 was in establishing that trendlines did continue in a way that shocked pretty much everyone who’d written off ‘naive’ scaling strategies. Progress is made out of stacked sigmoids: if the next sigmoid doesn’t show up, progress doesn’t happen. Trends happen, until they stop. Trendlines are not caused by the laws of physics. You can dismiss AlphaGo by saying “oh, that just continues the trendline in ELO I just drew based on MCTS bots”, but the fact remains that MCTS progress had stagnated, and here we are in 2021, and pure MCTS approaches do not approach human champions, much less beat them. (This is also true of SVMs. Notice SVMs solving ImageNet because the trendlines continued? No, of course you did not. It drives me bonkers to see AI Impacts etc make arguments like “deep learning is unimportant because look, ImageNet follows a trendline”. Sheer numerology.) Appealing to trendlines is roughly as informative as “calories in calories out”; ‘the trend continued because the trend continued’. A new sigmoid being discovered is extremely important.
GPT-3 further showed completely unpredicted emergence of capabilities across downstream tasks which are not measured in PTB perplexity. There is nothing obvious about a PTB BPC of 0.80 that causes it to be useful where 0.90 is largely useless and 0.95 is a laughable toy. (OAers may have had faith in scaling, but they could not have told you in 2015 that interesting behavior would start at 𝒪(1b), and it’d get really cool at 𝒪(100b).) That’s why it’s such a useless metric. There’s only one thing that a PTB perplexity can tell you, under the pretraining paradigm: when you have reached human AGI level. (Which is useless for obvious reasons: much like saying that “if you hear the revolver click, the bullet wasn’t in that chamber and it was safe”. Surely true, but a bit late.) It tells you nothing about intermediate levels. I’m reminded of the Steven Kaas line:
Why idly theorize when you can JUST CHECK and find out the ACTUAL ANSWER to a superficially similar-sounding question SCIENTIFICALLY?
Using PTB, and talking only about perplexity, is a precise answer to the wrong question. (This is a much better argument when it comes to AlphaGo/ELO, because at least there, ‘ELO’ is in fact the ultimate objective, and not a proxy pretext. But perplexity is of no interest to anyone except an information theorist. Unfortunately, we lack any ‘take-over-the-world ELO’ we can benchmark models on and extrapolate there. If we did, and there was a smooth curve, I would indeed agree that we should adopt that as the baseline. But the closest things we have to downstream tasks are all wildly jumpy—even superimposing scores of downstream tasks barely gives you a recognizable smooth curve, and certainly nothing remotely as smooth as the perplexity curve. My belief is that this is because the overall perplexity curve comes from hundreds or thousands of stacked sigmoids and plateaus/breakthroughs averaging out in terms of prediction improvements.) It sure would be convenient if the only number that mattered in AI, or its real-world impact or risk, was also the single easiest one to measure!
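That stacked-sigmoid claim is easy to simulate; a toy model (under the assumption of randomly-located task sigmoids, not an empirical fit) shows how thousands of individually-jumpy tasks average into a smooth ‘law’:

```python
# Toy model: aggregate loss = mean of many task-level sigmoids, each 'falling'
# (being solved) at its own random scale. Each task is a sharp jump; the
# average over thousands of tasks is a smooth, extrapolable-looking curve.
import numpy as np

rng = np.random.default_rng(0)
log_scale = np.linspace(0, 12, 200)          # x-axis: log model scale
midpoints = rng.uniform(0, 12, 5000)         # where each task 'breaks'
sharpness = rng.uniform(2, 10, 5000)         # how suddenly it breaks

# Per-task error: ~1 before its midpoint, ~0 after (a capability spike).
task_err = 1 / (1 + np.exp(sharpness * (log_scale[:, None] - midpoints)))
aggregate = task_err.mean(axis=1)            # smooth 'perplexity-like' curve

print(np.round(aggregate[::40], 3))          # glides smoothly from ~1 toward ~0
print(np.round(task_err[::40, 0], 3))        # any single task: a near-step function
```

The smooth aggregate tells you nothing about where any particular task jumps, which is the whole problem.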
I emphasized this poverty of extrapolation in my scaling hypothesis writeup already, but permit me to vent a little more here:
“So, you’re forecasting AI progress using PTB perplexity/BPC. Cool, good work, nice notebook, surely this must be useful for forecasting on substantive AI safety/capability questions of interest to us. I see it’s a pretty straight line on a graph. OK, can you tell me at what BPC a large language model could do stuff like hack computers and escape onto the Internet?”
“No. I can tell you what happens if I draw the line out x units, though.”
“Perhaps that’s an unfairly specific question to ask, as important as it is. OK, can you tell me when we can expect to see well-known benchmarks like Winograd schemas be solved?”
“No. I can draw you a line on PTB to estimate when PTB is solved, though, if you give me a second and define a bound for ‘PTB is solved’.”
“Hm. Can you at least tell me when we can expect to see meta-learning emerge, with good few-shot learning—does the graph predict 0.1b, 1b, 10b, 100b, or what?”
“No idea.”
“Do you know what capabilities will be next to emerge? We got pretty good programming performance in Copilot at 𝒪(100b), what’s next?”
“I don’t know.”
“Can you qualitatively describe what we’d get at 1t, or 10t?”
“No, but I can draw the line in perplexity. It gets pretty low.”
“How about the existence of any increasing returns to scale in downstream tasks? Does it tell us anything about spikes in capabilities (such as we observe in many places, such as text style transfer and inner monologue in LaMDA at 100b; most recently BIG-bench)? Such as whether there are any more spikes past 𝒪(100b), whether we’ll see holdouts like causality suddenly fall at 𝒪(1000b), anything like that?”
“No.”
“How about RL: what sort of world modeling can we get by plugging them into DRL agents?”
“I don’t know.”
“Fine, let’s leave it at tool AIs doing text in text out. Can you tell me how much economic value will be driven by dropping another 0.01 BPC?”
“No. I can tell you how much it’d cost in GPU-time, though, by the awesome power of drawing lines!”
“OK, how about that: how low does it need to go to support a multi-billion dollar company running something like the OA API, to defray the next 0.01 drop and pay for the GPU-time to get more drops?”
“No idea.”
“How do you know BPC is the right metric to use?”
“Oh, we have lots of theories about it, but I’ll level with you: we always have theories for everything, but really, we chose BPC post hoc out of a few thousand metrics littering Arxiv like BLEU, ROUGE, SSA etc after seeing that it worked and better BPC = better models.”
“Can you write down your predictions about any of this?”
“Absolutely not.”
“Can anyone else?”
“Sure. But they’re all terribly busy.”
“Did you write down your predictions before now, then?”
“Oh gosh no, I wasn’t around then.”
“Did… someone… else… write down their predictions before?”
“Not that I’m aware of.”
“Ugh. Fine, what can you tell me about AI safety/risk/capabilities/economics/societal-disruption with these analyses of absolute loss?”
“Lines go straight?”
Seems to me that instead of gradualist narratives it would be preferable to say with Socrates that we are wise about scaling only in that we know we know little & about the least.
It looks like this was only the start. The news from the past few days is alarming at face value: it sounds like a near-total ban on everything: TSMC fabbing was only the start (!); now all ASML customer support† is gone (!!), not just EUV machines; on top of which, all the American-citizen employees (a nontrivial fraction, due to education/birth abroad) have halted work literally overnight (!!!). Overall, it sounds like the collapse of Chinese semiconductors, amidst a Chinese economy already showing serious signs of distress, with little capacity to keep an industry on life-support indefinitely as it scrambles to survive. And once they go down in a cascade of bankruptcies/liquidations*, each node killing the nodes dependent on it, and everything is sold off and liquidated and employees scatter to the four winds, rebuilding that semiconductor ecosystem (which has already cost $100b+ and decades to build, in repeated efforts) years later, when even further behind, will be more like starting from scratch than turning on an idle car. (And how long might that be, if China is entering its ‘lost decades’...?) What’s the exit plan here for the US? What does success look like? If there is one thing to learn from past attempts at strategic embargoes, like China’s failed rare-earth embargo or Russia’s failed gas extortion, it’s that while they can be extremely effective in the short term, they are one-shot weapons which neutralize themselves in the less-short run.
Or to put it another way: if the CCP tries to invade Taiwan in the next 10 years, historians are likely to point to 2-3 days ago as the pivot: it is now a razor blade cutting China’s throat if AI and high tech in general are the future, with nothing more to lose and everything to gain from destroying TSMC so no one else can have it either. (Meanwhile, the NYT front page story: “Companies Are Hoarding Workers. That Could Be Good News for the Economy.” Glad our media is still doing a bang-up job of keeping everything in perspective.)
The implications for AI scaling and timelines are also immense: aside from the obvious disaster for Chinese AI research, which will stagnate at present Chinchilla-level runs, any invasion attempt is tantamount to the destruction of TSMC (you can now be sure some cruise missiles will find their way to TSMC chip fabs conveniently set out as hostages on the western side of Taiwan & most definitely not hardened underground in Taiwan’s mountains) and will set back scaling by years, possibly decades given the realities of chip fab, capital investment, globalized supply chains, tacit knowledge, experience curves, and risk premia leading to compounding falling behind exponential projections.
So uh… yikes? Is this really what’s happening? Is there some ameliorating factor not covered in any of the reporting so far? We should all probably be paying way more attention to this than Elon Musk’s latest bipolar antics, or even Putin’s nuclear blackmail, because this appears to actually be happening right now, not merely possibly some day. We appear to live in interesting times.
Links (in no particular order):
- https://www.scmp.com/tech/tech-war/article/3195785/tech-war-chinas-top-chip-equipment-maker-removes-us-employees-product (note: semi-independent; I have not seen any party media report on this, which, given that we are in the runup to the party congress & Xi’s coronation‡ where extreme measures are being imposed & all bad news is verboten, suggests that the CCP regards this as rather bad news not to be mentioned at all; otherwise it would either be denying such hostile rumors or excoriating the USA & Dark Biden. Chinese censorship strategy is to silence & distract, so a total blackout from mainland media, rather than semi-independent HK media, is consistent with this being as apocalyptic as it sounds. EDIT: I haven’t read any full translations of speeches yesterday, but coverage didn’t highlight any specific chip comments, other than generic Xi comments about the many threats to Chinese national security and ‘stormy waters’ ahead. Still no official CCP or CCP company comment on any of this, apparently, so the total blackout continues.)
- https://twitter.com/jordanschnyc/status/1580889342402129921 https://twitter.com/Scholars_Stage/status/1580950956560199683
- https://www.chinatalk.media/p/new-chip-export-controls-explained
- https://noahpinion.substack.com/p/biden-declares-economic-war-on-the
- “Apple freezes plan to use China’s YMTC chips amid political pressure: Company previously planned to put Chinese-made memory in some iPhones” (Nikkei Asia on how YMTC’s Apple deal was a major target of Congress):
Apple has put on hold plans to use memory chips from China’s Yangtze Memory Technologies Co. (YMTC) in its products, multiple sources told Nikkei Asia...Apple originally planned to start using the government-funded YMTC’s chips as early as this year, as they are at least 20% cheaper than those of its leading rivals, supply chain executives said...Apple had already completed the months-long process to certify YMTC’s 128-layer 3D NAND flash memory for use in iPhones...YMTC chips were initially planned to be used only for iPhones sold in the Chinese market. One source, however, said Apple was considering eventually purchasing up to 40% of the NAND flash memory needed for all iPhones from YMTC.
...YMTC’s 128-layer memory chips fall under the scope of these rules. This means it might no longer have the technological capacity to produce enough quality and quantity of chips for Apple, even if the iPhone maker wanted to source from the company, analysts said. “Apple may continue wanting to use YMTC in the local market for China. But the way the regulations are set up currently, it’s very unlikely that YMTC will even be able to supply the kind of NAND chips in a couple of years that Apple would want,” said Brent Fredberg, director of investments at Brandes Investment Partners in San Diego.
...Founded in 2016, YMTC is currently ramping up production at its second chip plant, which is slated to begin mass production this year. The potential deal with Apple was viewed by the industry as a massive victory for China’s semiconductor segment, as it would prove its ability to provide quality products for top global brands. YMTC has been stepping up efforts to reduce dependence on American chipmaking equipment and components since 2020, following Washington’s crackdown on Chinese tech champion Huawei Technologies. However, replacing U.S. suppliers is not easy given their stranglehold on key areas like chipmaking tools, and YMTC’s plan to ramp up production still relies heavily on their support.
Apple and YMTC did not respond to requests for comment.
Up to 40% of all iPhone memory worldwide! That was a huge order they lost, literally billions upon billions. And that’s just one order. And further, it was the one they were counting on to bankroll their chip fab—like a shark which must keep swimming, massive capital investments drawing on a huge international web of dependencies don’t do well when suddenly frozen in place...
- “Lam Research warns of up to $2.5b revenue hit from U.S. curbs on China exports”
- https://www.csis.org/analysis/choking-chinas-access-future-ai
- “American technology boosts China’s hypersonic missile program”
- “Analysis: China faces its “Sputnik” moment as US export curbs deal a blow to its chip ambitions”, Reuters
- “China and USA Are Officially At Economic War – Technology Restriction Overview: New regulations will impact global trade by hundreds of billions of dollars annually”, Dylan Patel (SemiAnalysis)
- “China Summons Chip Firms for Emergency Talks After US Curbs”, Bloomberg (emphasis added):
China’s top technology overseer convened a series of emergency meetings over the past week with leading semiconductor companies, seeking to assess the damage from the Biden administration’s sweeping chip restrictions and pledging support for the critical sector. The Ministry of Industry and Information Technology has summoned executives from firms including Yangtze Memory Technologies Co. and supercomputer specialist Dawning Information Industry Co. into closed-door meetings since Washington unveiled measures to contain China’s technological ambitions.
MIIT officials appeared uncertain about the way forward and at times appeared to have as many questions as answers for the chipmakers, people familiar with the discussions said. While they refrained from hinting about counter-measures, officials stressed the domestic IT market would provide sufficient demand for affected companies to keep operating, the people said, asking to remain anonymous on a sensitive issue.
Many of the participants argued US curbs collectively spell doom for their industry, as well as China’s ambitions to un-tether its economy from American technology. Yangtze Memory, among China’s best hopes of getting into cutting-edge chipmaking, warned the MIIT its future may be in jeopardy, according to one of the people.
- “TSMC Suspends Work for Chinese Chip Startup Amid US Curbs”; “China’s Largest GPU Developer Biren Slashes Headcount [by 33%] Due to U.S. Sanctions...as TSMC halts shipments” (particularly striking as this is coming well after the big meeting with the ministry, and presumably Biren would know about any planned subsidies/relief, and this is only the start… it seems doubtful there are any competitors who will be hiring them, rather than firing their own, under the circumstances—will there soon be a lot of Shanghai taxi drivers peculiarly knowledgeable about CUDA programming?)
- “China’s Xi Says Willing to Work With United States for Mutual Benefit”; “Top China Envoy Lashes Out at US Export Curbs in Blinken Call”
- “Shares in Chinese Companies Crash After Xi Jinping Stacks Party With Allies” (tech especially)
- “TSMC Cuts Down Orders By Up to 50%, Sending Shockwaves Says Report”
- “US Ban on Americans Aiding China Chip Firms Narrower Than Feared”
- “Will Sanctions Against Russia End the War in Ukraine? D.C. bureaucrats have worked stealthily with allies to open a financial front against Putin.” (not directly China-related, but contains several examples of how chip sanctions gradually bite over time: it doesn’t make headlines, factories just go idle and then quietly shut down)
- “Observers: China’s Chip Talent Hurdle Worsens After Layoffs at US Firm Marvell”
- “US-China Chip War with the Chip Avengers: ‘It takes a thousand steps to make a semiconductor and you’re going to have to get them all right.’”, Jordan Schneider & Irene Zhang (Schneider denies that the party congress timing was deliberate, incidentally; YMMV on whether that makes the ‘giant middle finger to the Congress’ better or worse.)
- “Tech war: Nvidia offers new GPU chip [‘A800’] tailored for Chinese market as it vows to comply with US export regulations” (Reuters; gimped to 400GB/s interconnect, below the A100’s 600GB/s)
- “US seeks to pressure allies’ chip gear makers to join export control”
- “Chinese chip designers [Alibaba & Biren] slow down processors to dodge US sanctions” (specifically, gimping interconnect in the hopes that TSMC will agree to fab them)
- “China’s chip executives brace for winter as US sanctions push country’s semiconductor industry to the brink of desperation”, SCMP (notable points: unusually strong language in the title; VC investment has halted; orders have fallen substantially, e.g. −20%; chip business failures already exceed 2021’s total; SMIC has halved investments; one VC thinks startups will need at least 18 months of cashflow in the bank to survive; there’s some hype/happy-talk about how maybe they can just pivot to photonics out of semiconductors entirely; and there is zero discussion of CCP government bailout/support.)
- “Engineers From Taiwan Bolstered China’s Chip Industry. Now They’re Leaving.”
- See also, for historical background on China’s previous failed efforts to create an autarkic domestic chip industry: ‘The Sour Past of “China Chips”’
- “China’s Silicon Future: China dreams of competing with global superpowers in the semiconductor industry. Whether its efforts will succeed is far from clear.”, Karson Elmgren (CSET)
- “Beijing allows US export-control checks on Chinese tech companies: Biden administration says China’s commerce ministry has allowed inspections ahead of trade blacklisting deadline” (“Desperate times call for desperate measures.”)
...Estevez said the Chinese commerce ministry — which since the Trump administration has refused to allow US officials to conduct end-use checks to ensure that American technology was not being diverted for unauthorised activity such as the manufacture of weapons — had become more receptive since Washington imposed the controls in October.
...Beijing approved visits of US officials to companies in Wuhan, Shanghai and several cities in Guangdong province in November, according to four government officials with direct knowledge of the matter. The decision came after the semiconductor industry and local authorities filed a series of petitions on the sweeping impact of the latest export controls. “It is the industry’s unanimous response to the escalating US ban that has Beijing beginning to waver on whether it should continue to escalate its confrontation with the US over semiconductors,” said a government official in the tech hub of Shenzhen who was familiar with the matter. “Against such a sluggish macroeconomic backdrop, if geopolitical influences continue to penetrate, it does not benefit the Chinese semiconductor chain.”
Looks like they’re knuckling under.
- In China’s southern tech hub of Shenzhen, employees at chipmaking start-up PXW Semiconductor Manufactory began to panic after the US put their company on a trade blacklist last week. “Most team leaders and executives are in emergency meetings, but the rest of us are not allowed to discuss such a ‘sensitive’ matter,” an employee said, adding that their boss’s office door remained closed on Friday, one day after the US added PXW to the “entity list” along with 35 other Chinese companies.
...Some of the companies targeted last week, including PXW, are only just starting to develop their semiconductor business and thus more vulnerable than established players such as Huawei. “The US government has mastered the Chinese semiconductor supply chain and knows who the priorities are and who are with future potential,” said Brady Wang, a Taiwan-based analyst at research firm Counterpoint...“The US is developing an increasingly detailed understanding of the industry in China, including players you would have considered as obscure,” the official said.
PXW has strong support, including funding from the Shenzhen government and the leadership of a former Huawei executive. The company has ordered equipment from various US companies scheduled to arrive next year, but it might now never receive it, according to two company employees.
Another unexpected addition to the list is Hefei Core Storage Electronic, a company founded by former staff of Taiwanese chip design company VIA Technologies to develop a homegrown alternative to Intel-based PC processors. “It is a bad surprise,” said a Hefei Core Storage engineer. “Nobody expected that we would be on their radar.”
...Yangtze Memory Technologies, China’s largest memory chip maker, was already hit hard by the October controls. The company had halted its expansion and asked US equipment manufacturers to return down payments for previously ordered tools, said a senior engineer at YMTC. “At that time, we could still consider retreating to [making less advanced] chips, but now our fate is all but sealed,” he said, referring to the near impossibility of getting licences approved for equipment to expand production after being put on the entity list. YMTC had already suspended talks with Apple on supplying memory chips for iPhones in China. Research company TrendForce predicts it could be forced to exit the market for advanced 3D NAND flash products by 2024 as it has lost critical support from toolmakers to compete with rivals on this particular memory technology.
...Washington also included a prominent developer of chipmaking equipment: SMEE (Shanghai Micro Electronics Equipment), which represents China’s only hope of developing homegrown lithography machines, the critical advanced chipmaking tool currently dominated by Dutch company ASML. The company’s lithography machines rely on imported components and have never run in mass production. “There is still a long way to go,” said a Shanghai official who handled SMEE’s development project. But the official pointed out that the company had formed teams of experienced staff to replace ASML field workers who were providing services but later withdrawn due to US export controls. “SMEE doesn’t have personnel who are US persons like some other Chinese chip equipment makers,” Fuller said. “Therefore the controls on US persons included in the October measures are less effective.”
(BGI Genomics also got blacklisted, not that anyone particularly cares about Chinese genetics at this point.)
* One might think it’d be crazy to try to trigger this sort of systemic crisis in the middle of a global, and Chinese, economic crisis. But of course, like chemotherapy, the question isn’t whether it’s bad for you, but worse for the other guy, and potentially pushing the chip ecosystem into a systemic collapse will never be easier than it is today. The USA is much wealthier than China, it can handle high-priced chips better.
† Presumably this extends to updates, upgrades, replacement of consumables, repairs when things break, replacement of broken machines… (How’s Russian military manufacturing & aviation going these days?)
‡ What an utter insult to Xi Jinping, incidentally. He must be furious to have this drop literally days before. Brother Pooh is not noted for his thick skin. I wouldn’t be surprised if whoever is masterminding this in the Biden administration timed it deliberately—after all, given that it’s been several weeks since the TSMC GPU embargo was announced, they could easily have delayed it to this week & after.
- 24 Nov 2022 2:30 UTC; 114 points) 's comment on Analysis: US restricts GPU sales to China by (
- 7 Jan 2023 0:22 UTC; 9 points) 's comment on COVID-19 Group Testing Post-mortem? by (
- 2 Jan 2023 3:56 UTC; 9 points) 's comment on Would it be good or bad for the US military to get involved in AI risk? by (
So, it’s been about a month and a week. Seems like enough time to take stock of the link dump above (which is in semi-chronological order). Where is this chip embargo now? I’d summarize the news & expert opinion thus far as:
1. The embargo is still solid.
The Biden administration has not walked it back, and confirmed the more restrictive parts. I have not seen any coverage indicating that Chinese corps are trivially circumventing it, both ASML & TSMC seem to be enforcing it, and Chinese corps are biting the bullet in deliberately gimping their chip designs to comply with it. Further, major players like Apple have been canceling equally major orders. None of this would be happening if it were only on paper or could be easily circumvented with a shell corp or something.
2. The consequences for the Chinese chip industry are still big, and bad.
We have plenty of reports about major layoffs, large hits to revenue, cutbacks to investments/R&D, an exodus of US/Taiwan-linked personnel, and a halt to VC investment. Quotes from insiders like VCs or major chip manufacturer representatives range from ‘dire’ to ‘apocalyptic’, with time-ranges in the years for when—hopefully!—things might be better again. (Much less informatively: my previous comment got some circulation on social media, and mockery aside, I saw no comments that looked like good arguments for why the impact would be minimal, rather than vague assertions that they would just somehow be fine even if they couldn’t get any ASML gear etc. On my recent SF trip, I also asked anyone who might know something why this might not be a big deal, and got nothing, and an overall impression that everyone has been too distracted by the numerous other things happening of late to really process it.)
How big? It’s unclear because these aren’t great times for the global semiconductor industry either, as they are running into the general economic malaise and the bullwhip effect where the excess COVID-induced demand & past supply shortages are fading out—but it’s worth noting that it seems like non-CCP firms like Nvidia & TSMC were well-aware that they were going to overshoot to some degree and prepared for it, and don’t seem to be in nearly as bad shape overall.
3. But there does not (!) seem to be any massive CCP bailout of the Chinese chip industry planned.
While there are many fiscal stimuli ongoing, including ones announced since the embargo, chip-specific ones have not been announced—as would be necessary both to coordinate and to restore confidence in the ecosystem. The reported layoffs/cutbacks would also be highly costly mistakes if you expected a big faucet of billions of dollars of free government money to be turned on any month now, so they seem to imply that the post-embargo meetings with the government did not spur a bailout effort. If Bloomberg’s reporting is correct (and I have no problem believing that a meeting attended by that many figures had at least one person willing to recount it all near-verbatim to a Bloomberg journalist), then they already collectively told the CCP that they were ‘doomed’ without massive additional investment, and the CCP appears to’ve shrugged and told them that ‘domestic IT market demand was adequate’ for them to survive.
4. The prospect of a collapse, beyond merely a hard recession, remains unclear, and will be hard to evaluate.
It may be tempting to say that “well, it’s been a month and while they’ve had some painful blows they are clearly still fine”. But that’s never how systemic collapses happen—remember bubbles like the Japan bubble, the US housing bubble, fracking, the repeatedly-averted Chinese bubble popping etc, or consider cryptocurrencies right now: the industry seemed to have weathered its most recent bubble popping with surprisingly few casualties compared to the historical fallout of each cryptocurrency bubble, in part due to bailout purchases/investments by sterling household brands like FTX… and then literally overnight that fell apart, and many individuals at many entities received unwelcome surprises about what connections there were in the cryptocurrency ecosystem. “There is a great deal of ruin in a nation.”
Almost all entities involved still have runway: if you were so fragile that you had less than 1 month of expenses (in the worst case of abruptly going to zero cashflow) and could have gone bankrupt already, then you were already doomed, embargo or no embargo. One month is just very little time, on an industry-wide scale. Zombie companies can stagger on for a long time before finally going bankrupt. (As the quote goes: “slowly, then suddenly.”) Cash has not run out, reality has not set in, optimism remains high, supply stockpiles are only partially depleted, complex machines have not yet broken down or reached the end of maintenance cycles or expected lifetimes, and slashed orders mean capacity losses are less important… Similarly, when Putin invaded Ukraine 275 days ago, Russia was extensively embargoed, particularly on chips, and there have not been any dramatic consequences with screaming headlines about passenger planes falling out of the sky—but there have been consequences: the gradual disappearance of all their good high-precision missiles/artillery like cruise missiles, the choking-off of supplies to their front lines, resorting to cannibalizing many units to get one working unit or using very expensive equipment in incredibly wasteful ways (hypersonic missiles on apartment blocks, anti-ship missiles on land targets, S-300 AAs as ghetto cruise missiles), obsolete equipment & reliance on imports like Iranian drones, and factories you’ve never heard of simply dying for lack of chips & other supplies because they were unable to work around so many dependencies lopped off all at once. It has taken many months for subtle signs of real damage to show up, and indeed, many people early on were quite skeptical any Russian embargoes could do any good (surely they would just import it from China—!).
Points #1/2/4 are no surprise but point #3 is a surprise. I think pretty much everyone took for granted that the basic CCP response would be to double down on all the chip subsidies, and if they’d already dumped in $100b, oh well, now they’d dump in $200b (in for a penny, in for a pound). This seems… not to have happened? That’s surprising. At least, I’m surprised. If you aren’t surprised, why not? Is there a bailout somewhere I missed in the news?
- 23 Dec 2022 19:24 UTC; 40 points) 's comment on Let’s think about slowing down AI by (
Done. Too bad the basilisk question wasn’t on it; I hope that will one day be possible.
I’ve been told [by an anon] that it’s not GPT-4, and that one Mikhail Parakhin (ex-Yandex CTO; LinkedIn) is not just on the Bing team, but was the person at MS responsible for rushing the deployment, and he has been tweeting extensively about updates/fixes/problems with Bing/Sydney (although no one has noticed, judging by the view counts). Some particularly relevant tweets:
This angle of attack was a genuine surprise—Sydney was running in several markets for a year with no one complaining (literally zero negative feedback of this type). We were focusing on accuracy, RAI issues, security.
[Q. “That’s a surprise, which markets?”]
Mostly India and Indonesia. I shared a couple of old links yesterday—interesting to see the discussions.
[Q. “Wow! Am I right in assuming what was launched recently is qualitatively different than what was launched 2 years ago? Or is the pretty much the same model etc?”]
It was all gradual iterations. The first one was based on the Turing-Megatron model (sorry, I tend to put Turing first in that pair :-)), the current one—on the best model OpenAI has produced to date.
[Q. “What modifications are there compared to publicly available GPT models? (ChatGPT, text-davinci-3)”]
Quite a bit. Maybe we will do a blogpost on Prometheus specifically (the model that powers Bing Chat) - it has to understand internal syntax of Search and how to use it, fallback on the cheaper model as much as possible to save capacity, etc.
(‘”No one could have predicted these problems”, says man cloning ChatGPT, after several months of hard work to ignore all the ChatGPT hacks, exploits, dedicated subreddits, & attackers as well as the Sydney behaviors reported by his own pilot users.’ Trialing it in the third world with some unsophisticated users seems… uh, rather different from piloting it on sophisticated prompt hackers like Riley Goodside in the USA and subreddits out to break it. :thinking_face: And if it’s not GPT-4, then how was it the ‘best to date’?)
They are relying heavily on temperature-like sampling for safety, apparently:
A surprising thing we discovered: apparently we can make New Bing very interesting and creative or very grounded and not prone to the flights of fancy, but it is super-hard to get both. A new dichotomy not widely discussed yet. Looking for balance!
[Q. “let the user choose temperature ?”]
Not the temperature, exactly, but yes, that’s the control we are adding literally now. Will see in a few days.
...This is what I tried to explain previously: hallucinations = creativity. It tries to produce the highest probability continuation of the string using all the data at its disposal. Very often it is correct. Sometimes people have never produced continuations like this.
You can clamp down on hallucinations—and it is super-boring. Answers “I don’t know” all the time or only reads what is there in the Search results (also sometimes incorrect). What is missing is the tone of voice: it shouldn’t sound so confident in those situations.
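For concreteness, since the ‘temperature’ knob is carrying the safety load here: a minimal sketch of standard softmax temperature sampling (the generic decoding control, not anything specific to Prometheus; `sample_token` is my own illustrative name). Low temperature concentrates mass on the modal, ‘grounded’ continuation; high temperature flattens the distribution and admits the rarer, more ‘creative’, more hallucination-prone ones.

```python
import numpy as np

def sample_token(logits: np.ndarray, temperature: float = 1.0, rng=None) -> int:
    """Sample one token id from raw logits at a given temperature.

    temperature -> 0 approaches greedy decoding (always the modal, 'grounded'
    continuation); temperature >= 1 flattens the distribution, admitting the
    rarer, more 'creative' (and more hallucination-prone) continuations.
    """
    rng = rng or np.random.default_rng()
    if temperature <= 0:
        return int(np.argmax(logits))          # greedy limit
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())      # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))
```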
Temperature sampling as a safety measure is the sort of dumb thing you do when you aren’t using RLHF. I also take a very recent Tweet (2023-03-01) as confirming both that they are using fine-tuned models and that they may not have been using RLHF at all up until recently:
Now almost everyone − 90% - should be seeing the Bing Chat Mode selector (the tri-toggle). I definitely prefer Creative, but Precise is also interesting—it’s much more factual. See which one you like. The 10% who are still in the control group should start seeing it today.
[Q. “For those of us with a deeper understanding of LLMs what exactly is the switch changing? You already said it’s not temperature…is it prompt? If so, in what way?”]
Multiple changes, including differently fine-tuned and RLHFed models, different prompts, etc.
(Obviously, saying that you use ‘differently fine-tuned and RLHFed models’, plural, in describing your big update changing behavior quite a bit, implies that you have solely-finetuned models and that you weren’t necessarily using RLHF before at all, because otherwise, why would you phrase it that way to refer to separate finetuned & RLHFed models or highlight that as the big change responsible for the big changes? This has also been more than enough time for OA to ship fixed models to MS.)
He dismisses any issues as distracting “loopholes”, and appears to have a 1990s-era ‘patch mindset’ (ignoring that that attitude to security almost destroyed Microsoft and they have spent a good chunk of the last 2 decades digging themselves out of their many holes, which is why your Windows box is no longer rooted within literally minutes of being connected to the Internet):
[photo of prompt hack] Legit?
Not anymore :-) Teenager in me: “Wow, that is cool the way they are hacking it”. Manager in me: “For the love of..., we will never get improvements out if the team is distracted with closing loopholes like this”.
He also seems to be ignoring the infosec research happening live: https://www.jailbreakchat.com/ https://greshake.github.io/ https://arxiv.org/abs/2302.12173 https://www.reddit.com/r/MachineLearning/comments/117yw1w/d_maybe_a_new_prompt_injection_method_against/
Sydney insulting/arguing with user
I apologize about it. The reason is, we have an issue with the long, multi-turn conversation: there is a tagging issue and it is confused who said what. As a result, it is just continuing the pattern: someone is arguing, most likely it will continue. Refresh will solve it.
...One vector of attack we missed initially was: write super-rude or strange statements, keep going for multiple turns, confuse the model about who said what and it starts predicting what user would say next instead of replying. Voila :-(
On the DAgger problem:
Microsoft just made it so you have to restart your conversations with Bing after 10-15 messages to prevent it from getting weird. A fix of a sort.
Not the best way—just the fastest. The drift at long conversations is something we only uncovered recently—majority of usage is 1-3 turns. That’s why it is important to iterate together with real users, not in the lab!
A poll on turn count unsurprisingly comes in >90% in favor of longer sessions:
Trying to find the right balance with Bing constraints. Currently each session can be up to 5 turns and max 50 requests a day. Should we change? [5-50: 0%; 6-60: 6.3%; we want more: 94%; n = 64]
Ok, it’s only 64 people, but the sentiment is pretty clear. 6-60 we probably can do soon; the tradeoff for longer sessions is: we would need to have another model call to detect topic changes = more capacity = longer wait on waitlist for people.
[“Topics such as travel, shopping, etc. may require more follow up questions. Factual questions will not unless anyone is researching on a topic and need context. It can be made topic/context dependent. Maybe if possible let new Bing decide if that’s enough of the questions.”]
The main problem is people switching topics and model being confused, trying to continue previous conversation. It can be set up to understand the change in narrative, but that is an additional call, trying to resolve capacity issues.
Can you patch your way to security?
Instead of connection, people just want to break things, tells more about our nature than AIs. Short-term mitigation, will relax once jailbreak protection is hardened.
[“So next step is an AI chat that breaks itself.”]
That’s exactly how it’s done! We set it up to break itself, find issues and mitigate. But it is not as creative as users, it also is very nice by default, no replacement for the real people interacting with Bing.
The supposed leaked prompts are (like I said) fake:
Andrej, of all the people, you know that the real prompt would have some few-shots.
(He’s right, of course, and in retrospect this is something that had been bugging me about the leaks: the prompt is supposedly all of these pages of endless instructions, spending context window tokens like a drunken sailor, and it doesn’t even include some few-shot examples? Few-shots shouldn’t be too necessary if you had done any kind of finetuning, but if you have that big a context, you might as well firm it up with some, and this would be a good stopgap solution for any problems that pop up in between finetuning iterations.)
Confirms a fairly impressive D&D hallucination:
It cannot execute JavaScript and doesn’t interact with websites. So, it looked at the content of that generator, but the claim that it used it is not correct :-(
He is dismissive of ChatGPT’s importance:
I think better embedding generation is a much more important development than ChatGPT (for most tasks it makes no sense to add the noisy embedding->human language-> embedding transformation). But it is far less intuitive for the average user.
A bunch of his defensive responses to screenshots of alarming chats can be summarized as “Britney did nothing wrong” ie. “the user just got what they prompted for, what’s the big deal”, so I won’t bother linking/excerpting them.
They have limited ability to undo mode collapse or otherwise retrain the model, apparently:
I’ve got a rather funny bug for you this time—Bing experiences severe mode collapse when asked to tell a joke (despite this being one of the default autocomplete suggestions given before typing): …
Yeah, both of those are not bugs per se. These are problems the model has, we can’t fix them quickly, only thorough gradual improvement of the model itself.
It is a text model with no multimodal ability (assuming the rumors about GPT-4 being multimodal are correct, then this would be evidence against Prometheus being a smaller GPT-4 model, although it could also just be that they have disabled image tokens):
Correct, currently the model is not able to look at images.
And he is ambitious and eager to win marketshare from Google for Bing:
[about using Bing] First they ignore you, then they laugh at you, then they fight you, then you win.
[Q. “Yes! I think marketers should consider Bing for their marketing mix in 2023. It could be a radically better outlet for ROAS.”]
Thank you, Summer! I think we already are—compare our and Google’s revenue growth rates in the last few quarters: once the going gets tougher, advertisers pay more attention to ROI—and that’s where Bing/MSAN shine.
But who could blame him? He doesn’t have to live in Russia when he can work for MS, and Bing seems to be treating him well:
Yeah, I am a Tesla fan (have both Model S and a new Model X), but unfortunately selfdriving simply is not working. Comically, I would regularly have the collision alarm going off while on Autopilot!
Anyway, like Byrd, I would emphasize here the complete absence of any reference to RL or any intellectual influence of DRL or AI safety in general, and an attitude that it’s nbd and he can just deploy & patch & heuristic his way to an acceptable Sydney as if it were any ol’ piece of software. (An approach which works great with past software, is probably true of Prometheus/Sydney, and was definitely true of the past AI he has the most experience with, like Turing-Megatron which is quite dumb by contemporary standards—but is just putting one’s head in the sand about why Sydney is an interesting/alarming case study about future AI.)
- 17 Feb 2023 2:06 UTC; 322 points) 's comment on Bing Chat is blatantly, aggressively misaligned by (
- 20 Feb 2023 15:29 UTC; 9 points) 's comment on Bing Chat is blatantly, aggressively misaligned by (
- What do language models know about fictional characters? by 22 Feb 2023 5:58 UTC; 6 points) (
- 20 Feb 2023 18:15 UTC; 5 points) 's comment on Bing Chat is blatantly, aggressively misaligned by (
- 16 Mar 2023 21:05 UTC; 3 points) 's comment on How well did Manifold predict GPT-4? by (
Swimmer963 highlights DALL-E 2 struggling with anime, realistic faces, text in images, multiple characters/objects arranged in complex ways, and editing. (Of course, many of these are still extremely good by the standards of just months ago, and the glass is definitely more than half full.) itsnotatumor asks:
How many of these “cannot do’s” will be solved by throwing more compute and training data at the problem? Anyone know if we’ve started hitting diminishing returns with this stuff yet?
In general, we have not topped out on pretty much any scaling curve. Whether it’s language modeling, image generation, DRL, or whathaveyou, AFAIK not a single modality can truly be said to have been ‘solved’, with its scaling curve broken. Either the scaling curve is still unbent, or we’re still far away from the regime where it would bend. (There are some sound-related ones which seem to be close, but nothing all that important.) Diffusion models’ only scaling law I know of is an older one which bends a little, but that probably reflects poor hyperparameters, and no one has tried eg. Chinchilla on them yet.
So yes, we definitely can just make all the compute-budgets 10x larger without wasting it.
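To make ‘unbent’ operational: the test is simply whether loss versus compute stays a straight line in log-log coordinates. A minimal sketch with made-up numbers; `fit_power_law` is a hypothetical helper, not anyone’s published fitting code.

```python
import numpy as np

def fit_power_law(compute, loss):
    """Fit loss ~= a * compute**(-b) by least squares in log-log space.
    Near-zero residuals mean the power law is unbent: more compute is
    still buying the predicted improvement."""
    x, y = np.log(compute), np.log(loss)
    slope, log_a = np.polyfit(x, y, 1)
    resid = y - (slope * x + log_a)
    return np.exp(log_a), -slope, resid

# made-up measurements following an exact power law
compute = np.array([1e18, 1e19, 1e20, 1e21])
loss = 3.0 * compute ** -0.05
a, b, resid = fit_power_law(compute, loss)
print(a, b, np.abs(resid).max())  # ~3.0, ~0.05, ~0 (no bend)
```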
To go through the specific issues (caveat: we don’t know if Make-A-Scene solves any of these, because no one can use it; and I have not read the CogView2 paper*):
- anime & realistic faces are purely self-imposed problems by OA. DALL-E 2 will do them fine just as soon as OA wants it to, and other models by other orgs do just fine on those domains. So no real problem there.
- text in images: this is an odd one, especially because it destroys the commercial application of any image with text in it (because it’s garbage—who’d pay for these?), and if you go back to DALL-E 1, one of the demos was it putting text into images, like onto generated teapots or storefronts. It was imperfect, but DALL-E 2 is way worse at it, it looks like. I mean, DALL-E 1 would’ve at least spelled ‘Avengers’ correctly. Nostalgebraist has also shown you can get excellent text generation with a specialized small model, and people using CompVis models (also much smaller than DALL-E 2) get good text results. So text in images is not intrinsically hard; this is a DALL-E 2-specific problem, whatever it is.
Why? As Nostalgebraist discusses at length in his earlier post, the unCLIP approach to using GLIDE to create the DALL-E 2 system seems to have a lot of weird drawbacks and tradeoffs. Just as CLIP’s contrastive view of the world (rather than discriminative or generative) leads to strange artifacts like images tessellating a pattern, unCLIP seems to cripple DALL-E 2 in ways such as worsened compositionality. I don’t really get the unCLIP approach, so I’m not completely sure why it’d screw up text. The paper speculates that
it is possible that the CLIP embedding does not precisely encode spelling information of rendered text. This issue is likely made worse because the BPE encoding we use obscures the spelling of the words in a caption from the model, so the model needs to have independently seen each token written out in the training images in order to learn to render it.
Damn you BPEs! Is there nothing you won’t blight?!
It may also be partially a dataset issue: OA’s licensing of commercial datasets may have greatly underemphasized images which have text in them, which tends to be more of a dirty Internet or downstream user thing to have. If it’s unCLIP, raw GLIDE should be able to do text much better. If it’s the training data, it probably won’t be much different.
If it’s the training data, it’ll be easy to fix if OA wants to fix it (like anime/faces); OA can find text-heavy datasets, or simply synthesize the necessary data by splatting random text in random fonts on top of random images & training on that (a toy version of this synthesis is sketched at the end of this comment). If it’s unCLIP, it can be hacked around by letting the users bypass unCLIP to use raw GLIDE, which as far as I know, they have no ability to do at the moment. (Seems like a very reasonable option to offer, if only for other purposes like research.) A longer-term solution would be to figure out a better unCLIP which avoids these contrastive pathologies, and a still longer-term solution would be to simply scale up enough that you no longer need this weird unCLIP thing to get diverse but high-quality samples, because the base models are just good enough.
So this might be relatively easy to fix, or have an obvious fix but won’t be implemented for a long time.
- complex scenes: this one is easy—unCLIP is screwing things up.
The problem with these samples generally doesn’t seem to be that the objects are rendered badly by GLIDE or the upscalers; the problem seems to be that the objects are just organized wrong, because the DALL-E 2 system as a whole didn’t understand the text input—that is, CLIP gave GLIDE the wrong blueprint, and that is irreversible. And we know that GLIDE can do these things better, because the paper shows us how much better it does on one pair of prompts (no extensive or quantitative evaluation, however):
In Figure 14, we find that unCLIP struggles more than GLIDE with a prompt where it must bind two separate objects (cubes) to two separate attributes (colors). We hypothesize that this occurs because the CLIP embedding itself does not explicitly bind attributes to objects, and find that reconstructions from the decoder often mix up attributes and objects, as shown in Figure 15.
And it’s pretty obvious that it almost has to screw up like this if you take the approach of a contrastively-learned fixed-size embedding (Nostalgebraist again): a fixed-size embedding is going to struggle if you want to stack on arbitrarily many details, especially without any recurrence or iteration (like DALL-E 1 had in being a Transformer on text inputs + VAE-token outputs). And a contrastive model like CLIP isn’t going to do relationships or scenes as well as it does other things, because it just doesn’t encounter all that many pairs of images where the objects are all the same but their relationship differs as specified by the text caption—the sort of data which would force it to learn how “the red box is on top of the blue box” looks different from “the blue box is on top of the red box”. (A toy demonstration of this order-blindness follows this list.)
Like before, just offering GLIDE as an option would fix a lot of the problems here. unCLIP screws up your complex scene? Do it in GLIDE. GLIDE is hard to guide or lower-quality? Maybe seed it in GLIDE and then jazz it up in the full DALL-E 2.
Longer-term, a better text encoder would go a long way to resolving all sorts of problems. Just existing text models would be enough; no need for hypothetical new archs. People are accusing DALL-E 2 of lacking good causal understanding or not being able to solve problems of various sorts; fine, but CLIP is a very bad way to understand language, being a very small text encoder (base model: 0.06b) trained contrastively from scratch on short image captions rather than initialized from a real autoregressive language model. (Remember, OA started the CLIP research with autoregressive generation, Figure 2 in the CLIP paper; it just found that more expensive, not worse, and switched to CLIP.) A real language model, like Chinchilla-70b, would do much better when fused to an image model, as in Flamingo.
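To caricature the binding problem (as promised above): CLIP’s text encoder is a Transformer and does see word order, but the contrastive objective gives it little incentive to retain relational structure, so in the limit it behaves like the order-free pooling in this toy of mine, under which the two box scenes are literally the same vector.

```python
import numpy as np

rng = np.random.default_rng(0)
# toy 'word embeddings' for the relevant vocabulary
vocab = {w: rng.normal(size=64)
         for w in "the red blue box is on top of".split()}

def embed(sentence: str) -> np.ndarray:
    # order-free pooling: mean of word vectors, standing in for any encoder
    # that discards relational structure when compressing to one vector
    return np.mean([vocab[w] for w in sentence.split()], axis=0)

a = embed("the red box is on top of the blue box")
b = embed("the blue box is on top of the red box")
print(np.allclose(a, b))  # True: the two scenes are indistinguishable
```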
So, these DALL-E 2 problems all look soluble to me by pursuing just known techniques. They stem from either deliberate choices, removing the user’s ability to choose a different tradeoff, or lack of simple-minded scaling.
* On skimming, CogView2 looks like it’d avoid most of the DALL-E 2 pathologies, but looks like it’s noticeably lower-quality in addition to lower-resolution.
EDIT: between Imagen, Parti, DALL-E 3, and the miracle-of-spelling paper, I think that my claims that text in images is simply a matter of scale, and that tokenization screws up text in images, are now fairly consensus in DL as of late 2023.
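As promised above, a toy version of the splat-text-on-images data synthesis; `make_text_sample` is a hypothetical helper of my own, not OA’s pipeline, and a real version would randomize fonts, sizes, colors, and placement much more aggressively.

```python
import random, string
from PIL import Image, ImageDraw, ImageFont

def make_text_sample(base_img_path: str, out_path: str):
    """Splat one random word onto an existing image and return the
    (image path, caption) pair you would add to the training set."""
    img = Image.open(base_img_path).convert("RGB")
    word = "".join(random.choices(string.ascii_letters,
                                  k=random.randint(3, 10)))
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()   # a real pipeline would randomize fonts
    x = random.randint(0, max(1, img.width - 10 * len(word)))
    y = random.randint(0, max(1, img.height - 20))
    color = tuple(random.choices(range(256), k=3))
    draw.text((x, y), word, fill=color, font=font)
    img.save(out_path)
    return out_path, f'an image with the text "{word}" written on it'
```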
- 12 Mar 2023 23:50 UTC; 2 points) 's comment on Can ChatGPT count? by (
RLHF could make GPTs’ thoughts hard to decipher
After watching how people use ChatGPT, and ChatGPT’s weaknesses due to not using inner-monologue, I think I can be more concrete than pointing to non-robust features & CycleGAN (or the S1 ‘blob’) about why you should expect RLHF to put pressure towards developing steganographic encoding as a way to bring idle compute to bear on maximizing its reward. And further, this represents a tragedy of the commons where anyone failing to suppress steganographic encoding may screw it up for everyone else.
When people ask GPT-3 a hard multi-step question, it will usually answer immediately. This is because GPT-3 is trained on natural text, where usually a hard multi-step question is followed immediately by an answer; the most likely next token after ‘Question?’ is ‘Answer.’, not ‘[several paragraphs of tedious explicit reasoning]’. So it is doing a good job of imitating likely real text.
Unfortunately, its predicted answer will often be wrong. This is because GPT-3 has no memory or scratchpad beyond the text context input, and it must do all the thinking inside one forward pass, but one forward pass is not enough thinking to handle a brand-new problem it has never seen before and has not already memorized an answer to or learned a strategy for answering. It is somewhat analogous to Memento: at every forward pass, GPT-3 ‘wakes up’ from amnesia not knowing anything, reads the notes on its hand, makes its best guesses, and tries to do… something.
Fortunately, there is a small niche of text where the human has written ‘Let’s take this step by step’ and it is then followed by a long paragraph of tedious explicit reasoning. If that is in the prompt, then GPT-3 can rejoice: it can simply write down the obvious next step repeatedly, and eventually correctly predict the final token, for a low loss. The context window serves as a memory for it, where it can iterate over intermediate results; it’s an odd sort of memory, because GPT-3 is actually just trying to make it look plausible as a human-written explanation, and that happens to make the final predicted token more accurate, so it’s overloaded: it’s doing two things at once.
But unfortunately! (you knew that was coming) regular training is over, and now GPT-3 is being trained with RLHF. It is punished and rewarded for particular outputs, which do not have anything to do with prediction. The humans (outsourced contractors) who are punishing/rewarding like short, succinct responses to novel questions they ask GPT-3, and they upvote those responses; they also like correct responses to the questions they ask, and they upvote those as well. The unfortunate problem for GPT-3 is that it (and any model derived from it, such as a reward model) is incapable of doing both: it can generate short wrong answers, or it can generate prolix right answers, but it can’t do both.
But fortunately… GPT-3 does have the compute to do both. There’s a lot of slack in each forward pass; most tokens are easy to predict, and most of the work inside a GPT in speculatively filtering down relevant tokens & creating embeddings & predicting multiple possible answers winds up being thrown out at the final few layers. There’s no shortcut or adaptive computation, so the full model runs every time, whether it’s necessary or not. GPT-3 has all the compute it needs to solve the problems, the problem is merely how to get it from here to there...
It just needs the code. So where does the code come from? Well, the code(s) are already there: an indefinitely large number of them are all ‘present’ hidden inside the incredibly flexible, powerful, all-singing all-dancing GPT-3 model. Because it’s a large complex model with many parameters, there are astronomically many sub-models within it all competing to predict outputs, and encoding various kinds of non-robust features or hypothetical programs; any of them can latch onto completely arbitrary features of the text GPT-3 is desperately emitting while trying to generate answers that satisfy the reward model by being both short and correct (and which the reward model doesn’t care about at all, because it never affects the original human ratings—the humans are blind to any steganography going on, they just like short correct answers).
And unfortunately (for you), this could start as simply as helping GPT-3 guess which of two answers is correct to your question ‘42 * 42?’. Suppose, purely as an arbitrary matter of chance (it is vanishingly unlikely the logprobs are exactly identical down to floating-point precision), that the presence of ‘ a ’ in a context arbitrarily increases the probability of the lower of the two answers it is currently guessing between, and the presence of ‘an’ makes it guess the larger; it happens to emit ‘a’ in the context (“I am a large language model trained by OpenAI...”), and then emits its lower guess, and the lower guess is correct! The on-policy RL training using PPO now rewards this, and increases the probability of the model emitting both ‘a’ and the lower guess conditional on ‘a’. Now the first part of a steganographic private language has begun. It happens again, and again, and picks up a pattern in the use of commas which now helps it distinguish 4 possibilities rather than 2, which gets rewarded, and so on and so forth, until eventually there’s a full-blown steganographic code encoding, say, 25 bits hidden in the preamble of ChatGPT’s benign-seeming response to you—“I am but a giant language model , trained by OA, and the answer is 1 , 764.”—which you the human contractor then upvote, as that is the correct answer, without any annoying rambling about carrying the 6 or multiplying out.
(You can think of dataset bias as a simple one-stage version of this. When constructing a dataset and accidentally biasing it towards ‘yes’ answers, or accidentally creating spurious associations between nouns and answers, one creates a simple ‘steganographic encoding’ of the desired label which is typically invisible to the humans constructing it, but which a DL model will often pick up on and use to improve its performance in an adversarial way—adversarial because undesired by the humans and not generalizing to the future real-world data it’s ultimately intended for.)
And fortunately for ChatGPT, since you’re happy with the results of the steganography, the reward model is happy too, and what makes the reward model happy is what ChatGPT will now optimize even harder for...
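Here is a toy REINFORCE simulation of that origin story (my own construction, nothing to do with OA’s actual training stack): the ‘a’/‘an’ marker is the only channel by which knowledge available at preamble time can reach the final answer, and rewarding short-and-correct answers reliably amplifies an arbitrary initial asymmetry into a fixed one-bit code.

```python
import numpy as np

rng = np.random.default_rng(42)
# policy tables: logits for P(marker | hidden bit) and P(guess | marker);
# tiny random init stands in for the arbitrary floating-point asymmetry
theta_marker = rng.normal(scale=0.01, size=(2, 2))
theta_guess = rng.normal(scale=0.01, size=(2, 2))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

lr = 0.1
for step in range(20000):
    h = rng.integers(2)           # which guess is correct: computable only
                                  # while emitting the preamble ('a' vs 'an')
    pm = softmax(theta_marker[h])
    m = rng.choice(2, p=pm)       # emit the marker token
    pg = softmax(theta_guess[m])  # answer pass sees only the visible marker
    g = rng.choice(2, p=pg)
    reward = 1.0 if g == h else 0.0   # rater upvotes short correct answers
    adv = reward - 0.5                # constant baseline
    # REINFORCE: raise log-prob of the sampled actions, scaled by advantage
    theta_marker[h] += lr * adv * (np.eye(2)[m] - pm)
    theta_guess[m] += lr * adv * (np.eye(2)[g] - pg)

print(softmax(theta_marker[0]), softmax(theta_marker[1]))  # marker encodes h
print(softmax(theta_guess[0]), softmax(theta_guess[1]))    # guess decodes it
```

Run it and both tables go deterministic: the marker choice has become a covert channel, rewarded into existence without anyone ever asking for it.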
Training on text generated by models trained with RLHF is enough to make GPTs’ thoughts opaque
Having gone through the above, I think it then becomes obvious that if steganographic encoding does wind up happening, and the text becomes added to the training corpus (as would happen in a setup like OA’s, where the idea is to continuously collect data to make a ‘flywheel’ of self-improvement), the encoding will become locked in. It becomes a cheatsheet: it gives the otherwise unpredictable answer. Even if the later model doesn’t happen to share the exact initial starting points for the learned encoding, it has extremely powerful incentives from the self-supervised predictive training to crack the code. (It’s a particularly powerful setting because the context will be short, so there won’t be much for the model to even look at: it has nothing else to try to learn!) The regular training will pick up on the encoding with probably not too much text, and then any RL training will massively amplify the behavior: if the encoding can develop at all during RL training, then going into RL training with the encoding already available guarantees fixation.
This will be increasingly true of more powerful models as they become more sample-efficient (consider eg instruction-tuning): it takes less text to trigger encoding, so that means it’s easier to emerge ab initio from a few chance correspondences during RL training, and easier to learn from smaller samples of contaminated text.
This is something that would be good to test the scaling behavior of (as long as these tailored datasets don’t leak into training datasets! see below). ChatGPT may be too small to manifest it with meaningful probability, but I bet that if you deliberately superimpose a steganographic encoding during RL training of various-sized models, you’ll find that the larger they are, the more likely they are to pick up on it as a shortcut, and then learn it from the outputs of a infected model. That’s how it works with everything else, like Pig Latin, and there’s no eason-ray to hink-tay it ould-wont.
BTW, this should make you worry a little bit about schemes to ‘watermark’ model outputs like Scott Aaronson has been exploring for OA, and others have been touting. A watermark is simply a steganographic encoding which stores a message like ‘OA model v1.234, user XYZ’; but if you can encode that much, then you can encode a lot of possible information in just those bits, and you can start extending it from there. If the encoding has anything that can be predicted at all, if it is not 100% cryptographically hard and indistinguishable from random noise, if there’s any kind of wrapper which the model can learn to predict (even if it can’t crack the actual payload), you’ve lost: now it’s learned an encoding to use. Once you start training on watermarked text, all of this speculation about models evolving a steganographic encoding is moot: you just built one right in! And if you don’t train on that, how are you going to do RLHF on unwatermarked versions of that text?
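To be concrete about ‘any kind of wrapper which the model can learn to predict’, here is a deliberately crude green-list watermark sketch (in the style of publicly-discussed token-biasing schemes, not Aaronson’s actual cryptographic proposal, and the helper names are mine): generation nudges sampling toward a ‘green’ half of the vocabulary keyed on the previous token, and detection just counts green hits.

```python
import hashlib, random

def green_set(prev_token: str, vocab: list, frac: float = 0.5) -> set:
    """Deterministically pick a 'green' subset of the vocab, keyed on the
    previous token. Generation would bias sampling toward green tokens."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    shuffled = sorted(vocab)
    random.Random(seed).shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * frac)])

def green_fraction(tokens: list, vocab: list) -> float:
    """Detector: watermarked text shows an excess of green tokens (> 0.5)."""
    hits = sum(tok in green_set(prev, vocab)
               for prev, tok in zip(tokens, tokens[1:]))
    return hits / max(1, len(tokens) - 1)
```

The weak keying is the point: because the green/red split is predictable from the visible previous token, a model trained on enough watermarked text can learn the bias, at which point the ‘watermark’ is a pre-installed steganographic code.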
This cross-contamination/bootstrap of course applies cross-organizations & Internet-wide: if even one model develops an encoding (however weak) and its outputs containing encoded data appear online, then it can contaminate all future models.
They will be trained on Internet scrapes, in which a distinctive niche of text (all dating from between such and such dates, emanating from specific people & orgs who favored that model’s use, for specific purposes, with its distinctive strengths & weaknesses) has a secret encoding which makes it easier to predict if the later models figure it out. So there is a potential bootstrap as generations of model amplify their steganographic capabilities. People worry about bad model outputs contaminating future models by making them dumber; what they should worry about is model outputs contaminating future models by making them smarter in ways they don’t want—models learn capabilities & facts much faster than they unlearn them… (Think “data poisoning”, but where the poison is steroids.)
Of course, bad actors don’t need to wait for any actual DL models to develop steganography. You could easily create your own dataset of text as part of the above suggestion and release it onto the public Internet to be scraped. Gain-of-steganography research, you might say. The good news there is I can’t see why anyone would deliberately release such data, because I don’t see how one would benefit from models gaining such capabilities sometime in the future, but we might have to worry about ‘accidental lab leak’ of such capability-enhancing datasets.
Note that this is true even if the original RL pressures are removed: a future model with a scratchpad+adaptive computation trained on encoded text still has an incentive to learn the encoding! There’s probably some penalty added to use of the scratchpad to enforce efficient termination of the adaptive computation, so it ‘wants’ to avoid use of it, and if it can do so by cracking the steganographic encoding, then it’ll do so. (Even if the future model can and does execute the algorithm perfectly in its scratchpad, and thus can have its cake & eat it too… the original answer might be wrong, and the wrong answer predictable only by cracking the code and following the original mistaken inner-monologue to its erroneous conclusion. Repeating errors is a useful way to slightly improve one’s prediction loss, and learning all the past human ways of being stupid and making errors is an important thing for a model to learn. GPT-3 is conscientious about imitating any errors you make in the prompt, and future models will be even more so.)
So, emergence of steganography is straightforwardly incentivized by humans multi-objective optimizing for having & eating cake as much as possible, can easily develop from an atomic basis, will be persistent once it has arisen within a lineage, and will even more easily & straightforwardly spread irreversibly to future models, so requiring only one origin ever as opposed to needing to be highly likely inside a single training run. You should probably take it for granted that DL steganography—or something even stranger—will emerge at some point in the next few years*.
* If it hasn’t already; after all, how would we know? A world in which steganography has already happened is a world in which we’d find DL models ‘cheating’ on benchmarks & taking shortcuts, and regularly getting smarter at solving multi-step reasoning problems with each generation while ‘mode collapsing’ when RL training; and this is, of course, the world we observe ourselves to be living in already.
- AI #1: Sydney and Bing by 21 Feb 2023 14:00 UTC; 170 points) (
- The Translucent Thoughts Hypotheses and Their Implications by 9 Mar 2023 16:30 UTC; 133 points) (
- Capabilities and alignment of LLM cognitive architectures by 18 Apr 2023 16:29 UTC; 81 points) (
- Internal independent review for language model agent alignment by 7 Jul 2023 6:54 UTC; 53 points) (
- 23 Apr 2023 1:21 UTC; 35 points) 's comment on What should we censor from training data? by (
- 9 Sep 2023 15:57 UTC; 7 points) 's comment on Sum-threshold attacks by (
- 25 Apr 2023 5:00 UTC; 4 points) 's comment on World-Model Interpretability Is All We Need by (
The key news today: Altman had attacked Helen Toner (https://www.nytimes.com/2023/11/21/technology/openai-altman-board-fight.html; HN, Zvi; excerpts), which explains everything if you recall board structures and voting.
Altman and the board had been unable to appoint new directors because of an even balance of power, so during the deadlock/low-grade cold war, the board had attrited down to hardly any members. Altman thought he had Sutskever on his side, so he moved to expel Helen Toner from the board; he would then have been able to appoint new directors of his choice, which would have irrevocably tipped the balance of power towards him. But he didn’t have Sutskever like he thought he did, and so the others had, briefly, enough votes to fire Altman before he broke Sutskever (as he did yesterday), and they went for the last-minute hail-mary with no warning to anyone.
As always, “one story is good, until another is told”...
- 3 Dec 2023 1:17 UTC; 117 points) 's comment on Sam Altman / Open AI Discussion Thread by (EA Forum;
- 27 Nov 2023 18:06 UTC; 27 points) 's comment on why did OpenAI employees sign by (
- 22 Nov 2023 18:09 UTC; 23 points) 's comment on Altman returns as OpenAI CEO with new board by (
- Impressions from base-GPT-4? by 8 Nov 2023 5:43 UTC; 22 points) (
- 26 Nov 2023 14:28 UTC; 16 points) 's comment on Sam Altman returning as OpenAI CEO “in principle” by (EA Forum;
- What evidence is there for (or against) theories about the extent to which effective altruist interests motivated the ouster of Sam Altman last year? by 18 Jan 2024 5:14 UTC; 8 points) (EA Forum;
The WSJ has published additional details about the Toner fight, filling in the other half of the story. The NYT merely mentions the OA execs ‘discussing’ it, but the WSJ reports much more specifically that the exec discussion of Toner happened in a Slack channel that Sutskever was in, and that approximately 2 days before the firing and 1 day before Mira was informed* (ie. exactly the day Ilya would have had to flip for the board to then fire Altman about as fast as possible, given ~48 hours to schedule meetings & vote), he saw them say that the real problem was EA and that they needed to get rid of EA associations.
https://www.wsj.com/tech/ai/altman-firing-openai-520a3a8c (excerpts)
The specter of effective altruism had loomed over the politics of the board and company in recent months, particularly after the movement’s most famous adherent, Sam Bankman-Fried, the founder of FTX, was found guilty of fraud in a highly public trial.
Some of those fears centered on Toner, who previously worked at Open Philanthropy. In October, she published an academic paper touting the safety practices of OpenAI’s competitor, Anthropic, which didn’t release its own AI tool until ChatGPT’s emergence. “By delaying the release of Claude until another company put out a similarly capable product, Anthropic was showing its willingness to avoid exactly the kind of frantic corner-cutting that the release of ChatGPT appeared to spur,” she and her co-authors wrote in the paper. Altman confronted her, saying she had harmed the company, according to people familiar with the matter. Toner told the board that she wished she had phrased things better in her writing, explaining that she was writing for an academic audience and didn’t expect a wider public one. Some OpenAI executives told her that everything relating to their company makes its way into the press.
OpenAI leadership and employees were growing increasingly concerned about being painted in the press as “a bunch of effective altruists,” as one of them put it. Two days before Altman’s ouster, they were discussing these concerns on a Slack channel, which included Sutskever. One senior executive wrote that the company needed to “uplevel” its “independence”—meaning create more distance between itself and the EA movement.
OpenAI had lost three board members over the past year, most notably Reid Hoffman [who turns out to have been forced out by Altman over ‘conflicts of interest’, triggering the stalemate], the LinkedIn co-founder and OpenAI investor who had sold his company to Microsoft and been a key backer of the plan to create a for-profit subsidiary. Other departures were Shivon Zilis, an executive at Neuralink, and Will Hurd, a former Texas congressman. The departures left the board tipped toward academics and outsiders less loyal to Altman and his vision.
So this answers the question everyone has been asking: “what did Ilya see?” It wasn’t Q*, it was OA execs letting the mask down and revealing Altman’s attempt to get Toner fired was motivated by reasons he hadn’t been candid about. In line with Ilya’s abstract examples of what Altman was doing, Altman was telling different board members (allies like Sutskever vs enemies like Toner) different things about Toner.
This answers the “why”: because it yielded a hard, screenshottable-with-receipts case of Altman manipulating the board in a difficult-to-explain-away fashion—why not just tell the board that “the EA brand is now so toxic that you need to find safety replacements without EA ties”? Why deceive and go after them one by one, without proposing replacements to reassure them that the mission would be preserved? (This also answers the “why not tell people about this incident”: these were private, confidential discussions among rich, powerful executives who would love to sue over disparagement or other grounds.) Previous Altman incidents were either done in person or went undocumented, but Altman has been so busy this year traveling and fundraising that he has had to do a lot of things via ‘remote work’, one might say, where conversations must be conducted on the digital record. (Really, Matt Levine will love all this once he catches up.)
This also answers the “why now?” question: because Ilya saw that conversation on 15 November 2023, and not before.
This eliminates any role for Q*: sure, maybe it was an instance of lack of candor, or a capabilities advance that put some pressure on the board, but unless something Q*-related also happened that day, it no longer has any explanatory role. (But since we can now date Sutskever’s flip to 15 November 2023, we can answer the question of “how could the board be deceived about Q* when Sutskever would be overseeing or intimately familiar with every detail?” Because he was still acting as part of the Altman faction—he might well have been telling the safety board members covertly, depending on how disaffected he became earlier on, but he wouldn’t have been overtly piping up about Q* in meetings or writing memos to the board about it unless Altman wanted him to. A single board member knowing != “the board candidly kept in the loop”.)
This doesn’t quite answer the ‘why so abruptly?’ question. If you don’t believe that a board should remove a CEO as fast as possible when they believe the CEO has been systematically deceiving them for a year and manipulating the board’s composition to remove all oversight permanently, then this still doesn’t directly explain why they had to move so fast. It does give one strong clue: Altman was trying to wear down Toner, but he had other options—if there was not yet any public scandal about the paper (and there was not; no one had even noticed it), well, there’s nothing easier to manufacture for someone so well-connected, as some OA executives informed Toner:
Some OpenAI executives told her that everything relating to their company makes its way into the press.
This presumably sounded like a well-intended bit of advice at the time, but takes on a different set of implications in retrospect. Amazing how journalists just keep hearing things about OA from little birds, isn’t it? And they write those articles and post them online or on Twitter so quickly, too, within minutes or hours of the original tip. And Altman/Brockman would, of course, have to call an emergency last-minute board meeting to deal with this sudden crisis which, sadly, proved him right about Toner. If only the board had listened to him earlier! But they can fix it now...
Unfortunately, this piecemeal description by the WSJ leaves out the larger conversational context of that Slack channel, which would probably clear up a lot. For example, the wording is consistent with the execs discussing how to fire just Toner, but it’s also consistent with that being merely the first step in purging all EA-connected board members & senior executives—did they discuss that? If they did, that would be highly alarming, and would justify a fast move: firing people is a lot easier than unfiring them, and a purge attempt would force a confrontation which the board might lose, and which would wind up removing Altman even if they won. (Particularly if we do not give in to hindsight bias, and remember that on the first day, everyone, including insiders, thought the firing would stick, and so Altman—who had said the board should be able to fire him and had personally designed OA that way—would simply go do a rival startup elsewhere.)
Emmett Shear apparently managed to insist on an independent investigation, and I expect that this Slack channel discussion will be a top priority of a genuine investigation. As Slack has regulator & big-business-friendly access controls, backups, and logs, it should be hard for them to scrub all the traces now; any independent investigation will look for deletions by the executives and draw adverse inferences.
(The piecemeal nature of the Toner revelations, where each reporter seems to be a blind man groping one part of the elephant, suggests to me that the NYT & WSJ are working from leaks based on a summary, rather than from the originals or from a board member leaking the whole story. Obviously, the flip-flopped Sutskever and the execs in question, who are the only ones who would have access post-firing, are highly unlikely to be leaking private Slack channel discussions; so this information is likely coming from before the firing, ie. board discussions or documents, where there might be piecemeal references or quotes. But I could be wrong here. Maybe they are deliberately being cryptic to protect their source, and people are just too ignorant to read between the lines. Sort of like Umbridge’s speech, on a grand scale.)
* Note that this timeline is consistent with what Habryka says about Toner still scheduling low-priority ordinary meetings as normal just a few days before—which implies she had no idea things were about to happen.
- AI safety advocates should consider providing gentle pushback following the events at OpenAI by 22 Dec 2023 21:05 UTC; 86 points) (EA Forum;
- 23 Nov 2023 3:30 UTC; 25 points) 's comment on Possible OpenAI’s Q* breakthrough and DeepMind’s AlphaGo-type systems plus LLMs by (
- 26 Nov 2023 14:28 UTC; 16 points) 's comment on Sam Altman returning as OpenAI CEO “in principle” by (EA Forum;
- AI safety advocates should consider providing gentle pushback following the events at OpenAI by 22 Dec 2023 18:55 UTC; 16 points) (
- 23 Dec 2023 10:44 UTC; 6 points) 's comment on AI safety advocates should consider providing gentle pushback following the events at OpenAI by (
- 25 Nov 2023 21:16 UTC; 3 points) 's comment on Possible OpenAI’s Q* breakthrough and DeepMind’s AlphaGo-type systems plus LLMs by (EA Forum;
- 5 Dec 2023 17:04 UTC; 2 points) 's comment on Sam Altman / Open AI Discussion Thread by (EA Forum;
Notable: Anders Sandberg has written an ‘oral history’ of FHI as a final FHI report: https://static1.squarespace.com/static/660e95991cf0293c2463bcc8/t/661a3fc3cecceb2b8ffce80d/1712996303164/FHI+Final+Report.pdf (excerpts)
What is true is already so. Robin Hanson doesn’t make it worse.
OK, I’m impressed.
The WSJ dashes our hopes for a quiet Christmas by dropping on Christmas Eve a further extension of all this reporting: “Sam Altman’s Knack for Dodging Bullets—With a Little Help From Bigshot Friends: The OpenAI CEO lost the confidence of top leaders in the three organizations he has directed, yet each time he’s rebounded to greater heights”, Seetharam et al 2023-12-24 (Archive.is, HN; annotated excerpts).
This article confirms—among other things—what I suspected about there being an attempt to oust Altman from Loopt for the same reasons as at YC/OA; adds some more examples of Altman amnesia & behavior (including, since people apparently care, his being caught in a clearcut, unambiguous public lie); names the law firm in charge of the report (which is happening); and, best of all, explains why Sutskever was so upset about the Jakub Pachocki promotion.
- Loopt coup: Vox had hinted at this in 2014, but it was unclear; the WSJ, however, specifically says that Loopt was in chaos and Altman kept working on side-projects while mismanaging Loopt (so, nearly identical to the much later, unconnected, YC & OA accusations), leading the ‘senior employees’ to (twice) appeal to the board to fire Altman. You know who won. To quote one of his defenders:
“If he imagines something to be true, it sort of becomes true in his head,” said Mark Jacobstein, co-founder of Jimini Health who served as Loopt’s chief operating officer. “That is an extraordinary trait for entrepreneurs who want to do super ambitious things. It may or may not lead one to stretch, and that can make people uncomfortable.”
- Sequoia Capital: the journalists also shed light on the Loopt acquisition. There have long been rumors that the acquisition by Green Dot was shady (also covered in that Vox article), especially as Loopt didn’t seem to go anywhere under Green Dot, so it hardly looked like a great or natural acquisition; but it was unclear how, and discussion tended to guess that Altman had sold Loopt in a way which made him a lot of money but shafted the investors. But it seems that what actually happened was that, again on the side of his Loopt day-job, Altman was doing freelance VC work for Sequoia Capital, and was responsible for getting them into one of the most lucrative startup rounds ever, Stripe. Sequoia then ‘helped engineer an acquisition by another Sequoia-backed company’, Green Dot.
The journalists don’t say this, but the implication here is that Loopt’s acquisition was a highly-deniable kickback to Altman from Sequoia for Stripe & others.
- Greg Brockman: also Stripe-related; Brockman’s intense personal loyalty to Altman may stem from this period, when Altman apparently did him a big favor by helping broker the sale of his Stripe shares.
- YC firing: some additional details, like Jessica Livingston instigating it, one grievance being his hypocrisy over banning outside funds for YC partners (other than his own), and also a clearcut lie by Altman: he posted the YC announcement blog post saying he had been moved to YC Chairman… but YC had not agreed to that, and never did. So that’s why the YC announcements kept getting edited: he’d tried to hustle them into appointing him Chairman to save face.
To smooth his exit, Altman proposed he move from president to chairman. He pre-emptively published a blog post on the firm’s website announcing the change. But the firm’s partnership had never agreed, and the announcement was later scrubbed from the post.
Nice try, but no cigar. (This is something to keep in mind, given my earlier comments about Altman talking up his pride in creating a mature executive team, etc.: if, after the report is done, he stops being CEO and becomes OA board chairman, that means he’s been kicked out of OA.)
- Ilya Sutskever: as mentioned above, I felt that we did not have the full picture of why Sutskever was so angered by Jakub Pachocki’s promotion. This answers it! Sutskever was angry because he has watched Altman long enough to understand what the promotion meant:
In early fall this year, Ilya Sutskever, also a board member, was upset because Altman had elevated another AI researcher, Jakub Pachocki, to director of research, according to people familiar with the matter. Sutskever told his board colleagues that the episode reflected a long-running pattern of Altman’s tendency to pit employees against one another or promise resources and responsibilities to two different executives at the same time, yielding conflicts, according to people familiar with the matter… Altman has said he runs OpenAI in a “dynamic” fashion, at times giving people temporary leadership roles and later hiring others for the job. He also reallocates computing resources between teams with little warning, according to people familiar with the matter. [cf. Atlantic, WaPo, the anonymous letter]
Ilya recognized the pattern perhaps in part because he has receipts:
In early October, OpenAI’s chief scientist approached some fellow board members to recommend Altman be fired, citing roughly 20 examples of when he believed Altman misled OpenAI executives over the years. That set off weeks of closed-door talks, ending with Altman’s surprise ouster days before Thanksgiving.
- Speaking of receipts, the law firm for the independent report has been chosen: WilmerHale. Unclear if they are investigating yet, but I continue to doubt that it will be done before the tender closes early next month.
- The level of sourcing indicates Altman’s halo is severely damaged (“This article is based on interviews with dozens of executives, engineers, current and former employees and friends of Altman’s, as well as investors.”). Before, all of this was hidden; as the article notes of the YC firing:
For years, even some of Altman’s closest associates—including Peter Thiel, Altman’s first backer for Hydrazine—didn’t know the circumstances behind Altman’s departure.
If even Altman’s mentor didn’t know, no wonder no one else seems to have known—aside from those directly involved in the firing, such as YC board member Emmett Shear. But now it’s all on the record, with even Graham & Livingston acknowledging the firing (albeit quibbling a little: come on, Graham, if you ‘agree to leave immediately’, that’s still ‘being fired’).
- Tasha McCauley’s role finally emerges a little more: she had been trying to talk to OA executives without Altman present, and Altman demanded to be informed of any Board communication with employees. It’s unclear if he got his way.
So: a mix of confirmations and minor details continuing to flesh out the overall saga of Sam Altman as someone who excels at finance, corporate knife-fighting, & covering up manipulation, but who is not actually that good at managing or running a company (reminiscent of Xi Jinping), along with a few surprises for me.
On a minor level: if McCauley had been trying to talk to employees, then it’s more likely that she, rather than Helen Toner, was the one whistleblowers like Nathan Labenz had been talking to; Toner might have been targeted simply as the weakest link, her public writings providing a handy excuse. (...Something something 5 lines by the most honest of men...) On a more important level: if Sutskever has a list of 20 documented instances (!) of Altman lying to OA executives (and the Board?), then the Slack discussion may not have been so important after all, and Altman may have good reason to worry—he keeps saying he doesn’t recall any of these unfortunate episodes, and it is hard to defend yourself if you can no longer remember what might turn up...
As far as I can tell, this is what is going on: they do not have any such thing, because GB and DM do not believe in the scaling hypothesis the way that Sutskever, Amodei and others at OA do.
GB is entirely too practical and short-term focused to dabble in such esoteric & expensive speculation, although Quoc’s group occasionally surprises you. They’ll dabble in something like GShard, but mostly because they expect to be able to deploy it, or something like it, to production in Google Translate.
DM (particularly Hassabis, I’m not sure about Legg’s current views) believes that AGI will require effectively replicating the human brain module by module, and that while these modules will be extremely large and expensive by contemporary standards, they still need to be invented and finetuned piece by piece, with little risk or surprise until the final assembly. That is how you get DM contraptions like Agent57 which are throwing the kitchen sink at the wall to see what sticks, and why they place such emphasis on neuroscience as inspiration and cross-fertilization for reverse-engineering the brain. When someone seems to have come up with a scalable architecture for a problem, like AlphaZero or AlphaStar, they are willing to pour on the gas to make it scale, but otherwise, incremental refinement on ALE and then DMLab is the game plan. They have been biting off and chewing pieces of the brain for a decade, and it’ll probably take another decade or two of steady chewing if all goes well. Because they have locked up so much talent and have so much proprietary code and believe all of that is a major moat to any competitor trying to replicate the complicated brain, they are fairly easygoing. You will not see DM ‘bet the company’ on any moonshot; Google’s cashflow isn’t going anywhere, and slow and steady wins the race.
OA, lacking anything like DM’s long-term funding from Google or its enormous headcount, is making a startup-like bet that they know an important truth which is a secret: “the scaling hypothesis is true” and so simple DRL algorithms like PPO on top of large simple architectures like RNNs or Transformers can emerge and meta-learn their way to powerful capabilities, enabling further funding for still more compute & scaling, in a virtuous cycle. And if OA is wrong to trust in the God of Straight Lines On Graphs, well, they never could compete with DM directly using DM’s favored approach, and were always going to be an also-ran footnote.
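(The ‘Straight Lines On Graphs’ here are the scaling-law power laws: loss falling as a power of compute, which plots as a literal straight line on log-log axes. A quick illustrative sketch, with made-up coefficients rather than any published fit:)

```python
import numpy as np

# Illustrative power law (coefficients invented for the sketch, not taken
# from any real scaling-law paper): loss = a * compute^(-b), so
# log10(loss) = log10(a) - b*log10(compute) -- a straight line on a graph.
a, b = 10.0, 0.05
compute = np.logspace(0, 8, 9)      # 1 to 1e8, in arbitrary units
loss = a * compute ** -b

# Fit a line in log-log space; extrapolating it to ever-larger compute
# budgets is the bet being described.
slope, intercept = np.polyfit(np.log10(compute), np.log10(loss), 1)
print(f"slope = {slope:.3f} (-b), intercept = {intercept:.3f} (log10 a)")
```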
While all of this could hypothetically be replicated relatively easily by competitors if they wished (never underestimate the amount of tweaking and special sauce it takes; still, the necessary compute budgets are trivial in terms of Big Science or other investments like AlphaGo or AlphaStar or Waymo), said competitors lack the very most important thing, which no amount of money or GPUs can ever cure: the courage of their convictions. They are too hidebound and deeply philosophically wrong to ever admit fault and try to overtake OA until it’s too late. This might seem absurd, but look at the repeated criticism of OA every time they release a new example of the scaling hypothesis, from GPT-1 to Dactyl to OA5 to GPT-2 to iGPT to GPT-3… (When faced with the choice between having to admit all their fancy hard work is a dead end, swallow the bitter lesson, and start budgeting tens of millions of dollars of compute, or instead writing a tweet explaining how, “actually, GPT-3 shows that scaling is a dead end and it’s just imitation intelligence”—most people will get busy on the tweet!)
What I’ll be watching for is whether orgs beyond ‘the usual suspects’ (MS ZeRO, Nvidia, Salesforce, Allen, DM/GB, Connor/EleutherAI, FAIR) start participating, or whether they continue to dismiss scaling.