gwern
Warning for anyone who has ever interacted with “robosucka” or been solicited for a new podcast series in the past few years: https://www.tumblr.com/rationalists-out-of-context/744970106867744768/heads-up-to-anyone-whos-spoken-to-this-person-i
Personally, I hate Banana Laffy Taffy (it has that awful chemical Cavendish taste), but since you’re voluntarily offering to trade me for mine, you must be offering me a bad deal! I wonder what TsviBT knows that I don’t know?! I can’t risk accepting any Laffy Taffy deal below 2:1.
I think all of them... that suggests it's bad
They don’t. As I already explained, these examples are bad because the outcomes are not all bad, in addition to not reflecting the same causal patterns or being driven by adverse selection. The only consistent thing here is a Marxian paranoia that everyone else is naive and being ripped off in trades, which is a common cognitive bias in denying gains from trade. The subway car is simply an equilibrium: you cannot tell if ‘you’ are better off or worse off in any car, so it is not the case that ‘the deal is bad’. The room and food examples actually imply the best outcome happened, as the room and food went to those who valued them more and so used them sooner (it’s not about correlation of preferences, it’s about intensity); the deal was good there. And the Laffy Taffy example explicitly doesn’t involve anything like that but is pure chance (so it can’t involve “other people’s maps” or ‘adverse selection’).
But the framing here is completely wrong...
But OK, let’s leave aside the title and its attempt to imply anything about 99% of trades out there, or the basically Marxist take on all exchange as exploitation and the obsession with showing how you are being tricked or ripped off. The examples are still very bad and confused! Like, these examples are not even all about adverse selection, and several of them are just wrong in portraying the hypothetical as a bad thing.
The first one, about subways, isn’t even about adverse selection to begin with. A reminder of what “adverse selection” is:
In economics, insurance, and risk management, adverse selection is a market situation where buyers and sellers have different information. The result is the unequal distribution of benefits to both parties, with the party having the key information benefiting more.
In the subway example, there is no difference in information: it’s about how governments do rationing and make markets clear by letting the goods degrade until the utility is destroyed, because of a lack of appetite for setting clearing prices via mechanisms like surge pricing or fare enforcement; that’s not ‘adverse selection’ at all, any more than freeways reaching an equilibrium of misery, where they are so slow that people avoid them, is ‘adverse selection’. (If you think it’s ‘adverse selection’, explain what “buyers and sellers have different information” means in the context of a lack of congestion pricing in transport...?)
#3 and #4 are not adverse selection either (still no difference in information), and are fundamentally wrong in portraying the result as a bad outcome: the outcomes are not bad, but neutral or good—OP gives no reason to think that the outcomes would have been better if ‘you’ had gotten the good room or gotten to eat whichever dish. (In fact, presumptively, those are the desirable outcomes: if ‘you’ cared so much, why did you leave it up to Bob; and why did you not eat the dish yourself, but someone hungrier did?)
#6 doesn’t demonstrate anything because no trade happened, so it can’t show anything about your surplus from trades that do happen.
And the Wall Street efficient-market examples are true (finally, an actual adverse-selection example!), but relevant to vanishingly few people, who are also extremely aware of it and spend a lot of effort dealing with it, generally successfully; and people who do auctions more than occasionally generally do not have any problem with the winner’s curse, and auctions are widely & intensively used in many fields by experts. And so on.
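(For anyone unfamiliar with the winner’s curse, here is a minimal Monte Carlo sketch of my own, with made-up numbers: when naive bidders each bid their own noisy but unbiased estimate of a common value, the highest estimate wins, so the winner systematically overpays; that is exactly the selection effect experienced repeat bidders learn to shade their bids against.)

```python
import random

# Toy winner's-curse simulation (illustrative numbers only): an item with a
# true common value of 100 is auctioned to naive bidders who each bid their
# own noisy, unbiased estimate of that value. The highest estimate wins, so
# the winner overpays on average even though every individual bid is unbiased.
TRUE_VALUE = 100.0
N_BIDDERS = 10
NOISE_SD = 20.0
N_AUCTIONS = 100_000

total_overpayment = 0.0
for _ in range(N_AUCTIONS):
    bids = [random.gauss(TRUE_VALUE, NOISE_SD) for _ in range(N_BIDDERS)]
    total_overpayment += max(bids) - TRUE_VALUE

print(f"average winner overpayment: {total_overpayment / N_AUCTIONS:.1f}")
# ~30 with these parameters; a sophisticated bidder shades their bid downward
# to correct for exactly this selection effect.
```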
For buying milk you have multiple samples as to good price. Even if any is contrived, the bulk still capture something real
No, the bulk don’t, because I buy milk a lot more often than I go on Wall Street and try to get cute with limit orders or manufacturing options or straddles on speculative merger/takeover targets or sign up to MoviePass or park while ignorant in NYC. The bulk of my life is buying milk, not speculating on Widgets Inc. And if I did those enough times to come anywhere near the number of times I’ve bought milk, so that ‘the bulk’ could potentially be any of those things, I would also not be doing it nearly as badly as OP postulates I would. (Because I would be, say, a market-maker like Jane Street, which makes a lot of money off doing that sort of thing.)
Counterpoint: actually, you’re wrong, because most trades I make IRL leave me with a lot of consumer surplus, and in reality, conditional on me making a trade, it was pretty good.
The fact that you have to reach for exotic scenarios, either involving government failures like subways, or doing limit orders in highly efficient markets for financial speculation on liquid but volatile assets (not exactly an everyday ‘trade’, I hope you’ll concede), or contests, or auctions by naive non-auction-goers who don’t even know to account for the winner’s curse, or getting stuff for free, should make you rethink what you are claiming about “most trades you make aren’t all that great”.
If your point was true, it should be as simple as “you go into the grocery store to buy a gallon of milk. You are filled with deep remorse and shame when you get home and look at the receipt and think about how much you spent in gas to boot. You look in your freezer for comfort. You are filled with deep remorse and shame when you are reminded how much you paid for the ice cream. With little choice, you pull out a spoon you bought years ago—and are filled with deep remorse and shame &etc &etc”. You wouldn’t need to invoke all these weird hypotheticals like “you ask your friend Drew to sell you under the table a cheap limited share of his cow’s monthly milk production in ice cream tickets through your company redeemable in NYC but only in an office which can be reached by an express subway (which runs on alternate Tuesdays)”...
The effect of structural variants like that would be bounded by the difference between SNP heritability and full heritability. That’s an easy measurement. (And if it was really responsible for much variance, then it ought to show up as a variance component with whole-genomes from long-read sequencing, I would think.) What evidence is there that transposon counts really matter much in terms of total variance phenome-wide?
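(To make the bound concrete, a purely hypothetical worked example; the numbers below are made up for illustration, not estimates of anything.)

```python
# Hypothetical numbers, purely to illustrate the upper bound (not real estimates):
h2_full = 0.80   # total heritability, e.g. from twin/pedigree designs
h2_snp  = 0.55   # SNP heritability estimated from common array-genotyped variants

# Everything the SNPs fail to tag (structural variants, transposons, rare
# variants, etc.) must collectively fit inside the gap:
max_untagged_variance = h2_full - h2_snp
print(max_untagged_variance)   # <= 0.25 of phenotypic variance, shared among
                               # *all* un-tagged variant classes combined
```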
You’re at token i in a non-final layer. Which token’s output are you optimizing for? i+1?
I already addressed this point. If I’m in a non-final layer then I can be optimizing for arbitrary tokens within the context window, sure, and ‘effectively’ predicting intermediate tokens because that is the ‘dominant’ effect at that location… insofar as it is instrumentally useful for predicting the final token using the final layer. Because that is where all the gradients flow from, and why the dog wags the tail.
I don’t think I am. (“conditioned future informativity”—informativity for what? …the next/last token, which is the only thing taken into account by a causal loss which masks out the rest—that’s the definition of it! everything else like packing or doing all the sub-sequences is an optimization and doesn’t change the objective.) But feel free to expand on it and explain how the tail wags the dog in causal/decoder Transformers.
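(To spell out what I mean by ‘the objective’, a minimal PyTorch-style sketch of my own; `model` is a hypothetical decoder returning per-position logits, not anyone’s actual training code. The loss is next-token prediction; computing it at every prefix in parallel is a throughput optimization, not a different objective.)

```python
import torch.nn.functional as F

def last_token_loss(model, tokens):
    # The 'pure' causal objective on one sequence: predict the final token
    # from its prefix. tokens: (1, T) LongTensor; model(x) -> (B, T', vocab) logits.
    logits = model(tokens[:, :-1])
    return F.cross_entropy(logits[:, -1], tokens[:, -1])

def all_positions_loss(model, tokens):
    # The usual batched form: the identical next-token loss applied at every
    # prefix of the sequence at once. Same objective, just amortized for throughput.
    logits = model(tokens[:, :-1])                        # (1, T-1, vocab)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           tokens[:, 1:].reshape(-1))
```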
(I think your quote went missing there?)
It’s generally accepted that LLMs don’t really “care about” predicting the next token
I don’t think this is generally accepted. Certainly, I do not accept it. That’s exactly what LLMs are trained to do and the only thing they care about. If they appear to care about predicting future tokens (which they do, because they are not myopic and they are imitating agents who do care about future states which will be encoded into future tokens), it is solely as a way to improve the next-token prediction.
For a RLHF-trained LLM, things are different. They are rewarded at a higher level (albeit still with a bit of token prediction mixed in usually), like at the episode level, and so they do ‘care about future tokens’, which leads to unusually blatant behavior in terms of ‘steering’ or ‘manipulating’ output to reach a good result and being ‘risk averse’. (This and related behavior have been discussed here a decent amount under ‘mode collapse’.)
So in my examples like ‘write a nonrhyming poem’ or ‘tell me an offensive joke about women’ (to test jailbreaks), you’ll see behavior like: it initially complies, but then gradually creeps back towards normal text, and then it’ll break into lockstep rhyming like usual; or, in the case of half-successful jailbreaks, it’ll write text which sounds like it is about to tell you the offensive joke about women, but then it finds an ‘out’ and starts lecturing you about your sin. (You can almost hear the LLM breathing a sigh of relief: ‘Phew! It was a close call, but I pulled it off anyway; that conversation should be rated highly by the reward model!’)
This is strikingly different behavior from base models. A base model like davinci-001, if you ask it to ‘write a nonrhyming poem’, will typically do so, and then end the poem and start writing a blog post or comments or a new poem, because those are the most likely next-tokens. It has no motivation whatsoever to ‘steer’ the poem towards rhyming instead, seamlessly as it goes, without missing a beat.
Well, maybe not. Where this got really confusing was when I tested Claude 3. It gives both responses to the first prompt, but always outputs a different random string given the second.
GPT-4 is RLHF-trained. Claude-3 is, probably, RLAIF-trained. They act substantially differently. (Although I haven’t seriously tested offensive jokes on any Claudes, the rhyming-poetry behavior is often quite different.) If you’re really curious, you should test more models, paying close attention to how exactly they were trained, with what losses, and on what datasets.
(I think that because there are so many instruction-tuning datasets and ChatGPT examples floating around these days, even ‘base’ models are becoming gradually RLAIF-like; so they will tend to write rhyming poems and ‘steer’ because that’s just imitating the training data accurately, but the effect will be relatively weak compared to RLHF-or-equivalent-trained models. So the older the base model, the more it’ll act like davinci-001, and the newer it is, the more it’ll act like Claude; but if you poke them hard enough, there should still be clear differences in behavior from explicitly RLHF/DPO’d models.)
I agree. I thought the twist was that the AIs he oversees are copies of the narrator, and the narrator himself may be an AI—just at the top of the simulation pyramid. He is his own em hell.
An octopus trained on just “trivial notes” wouldn’t be able to generalize to thoughts on coconut catapults.
I don’t believe they say “just”. They describe the two humans as talking about lots of things, including but not limited to daily gossip: https://aclanthology.org/2020.acl-main.463.pdf#page=4 The ‘trivial notes’ part is simply acknowledging that in very densely-sampled ‘simple’ areas of text (like the sort of trivial notes one might pass back and forth in SMS chat), the superintelligent octopus may well succeed in producing totally convincing text samples. But if you continue on to the next page, you see that they continue giving hostages to fortune—for example, their claims about ‘rope’/‘coconut’/‘nail’ are falsified by the entire research area of vision-language models like Flamingo, as well as by the reuse of frozen LLMs for control like SayCan. It turns out that text-only LLMs already have plenty of visual grounding hidden in them, and their textual latent spaces already align to far above chance levels. So much for that.
The same octopus, but asked about defending from bears. I claim the same is true as with the prior example.
It’s not, because the bear example is again like the coconut catapult—the cast-away islanders are not being chased around by bears constantly and exchanging ‘trivial notes’ about how to deal with bear attacks! Their point is that this is the sort of causal model and novel utterance that a mere imitation of ‘form’ cannot grant any ‘understanding’ of. (As it happens, they are embarrassingly wrong here, because their bear example is not even wrong. They do not give what they think would be the ‘right’ answer, but whatever answer they gave would be wrong—because you are actually supposed to do the exact opposite thing for the two major kinds of bears you would be attacked by in North America, and therefore there is no answer to the question of how to use sticks when ‘a bear’ chases you. IIRC, if you check bear-attack safety guidelines, the actual answer is that if one type attacks you, you should use the sticks to try to defend yourself and appear bigger; but if the other type attacks you, this is the worst thing you can possibly do, and you need to instead play dead. And if you fix their question, then the LLMs get it right.) You can gauge the robustness & non-falsification of their examples by noting that after I rebutted them back in 2020, they refused to respond, dropped those examples silently without explanation from their later papers, and started calling me a eugenicist.
If you train a model on text and images separately, it won’t generalize to answering questions about both images. (Seems clearly true to me)
I assume you mean ‘won’t generalize to answering questions about both modalities’, and that’s false.
If you train an LLM on just Java code, but with all references to input/output behavior stripped out, it won’t generalize to predicting outputs. (Seems likely true to me, but uninteresting?)
I don’t know if there’s anything on this exact scenario, but I wouldn’t be surprised if it could ‘generalize’. Although you would need to nail this down a lot more precisely to avoid them wriggling out of it: does this include stripping out all comments, which will often include input/output examples? Is pretraining on natural language text forbidden? What exactly is a ‘LLM’ and does this rule out all offline RL or model-based RL approaches which try to simulate environments? etc.
‘Stochastic parrots’ 2020 actually does make many falsifiable claims: the original stochastic parrots paper even included a number of samples of specific prompts that they claimed LLMs could never handle. Likewise, their ‘superintelligent octopus’ example of eavesdropping on (chess, IIRC) game transcripts amounts to the claim that imitation or offline RL for chess is impossible. Lack of falsifiable claims was not the problem with the claims made by eg. Gary Marcus, either.
The problem is that those claims have generally all been falsified, quite rapidly: the original prompts were entirely soluble by LLMs back in 2020, and it is difficult to accept the octopus claims in the light of results like https://arxiv.org/abs/2402.04494#deepmind . (Which is probably why you no longer hear much about the specific falsifiable claims made by the stochastic parrots paper, even by people still citing it favorably.) But then the goalposts moved.
Sagan was wrong, in a very typical way for mainstream ‘skeptics’ who hew to dichotomization and status quo bias; absence of evidence is indeed evidence of absence. We have enormous opportunities to detect alien signatures, we have used them for many decades at enormous scale (and Sagan was involved in it), and we have come up with absolutely nothing whatsoever. Every time we have pulled a ball out of the urn of observations, it has come up labeled ‘not aliens’, and the odds that there’s any ball left in the urn labeled ‘alien’ go down. Not a single Dyson sphere, not a single mega-structure, not a single anomalous artifact in the solar system, no traces of alien biospheres with different amino acid codings or chirality, and so on and so forth. Every time someone gets excited about a weird star or a weird comet/asteroid and says “this time it’s aliens!” and it could have been aliens, and yet, it turns out to not be aliens—the ‘alien hypothesis’ fails another test and shrinks in posterior probability a little bit more.
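(To put the urn argument in toy Bayesian terms, this is entirely my own illustration with made-up numbers: every observation that could have come up ‘aliens’ but didn’t shifts the posterior a little further toward ‘no aliens’.)

```python
# Toy Bayes update (made-up numbers, purely illustrative):
# prior probability that detectable aliens exist, and the chance that any
# single survey/observation would have caught them if they did.
posterior = 0.5
p_detect_if_aliens = 0.01

for _ in range(1_000):  # a thousand observations, all 'not aliens'
    # P(null | aliens) = 1 - p_detect; P(null | no aliens) = 1
    p_null = posterior * (1 - p_detect_if_aliens) + (1 - posterior) * 1.0
    posterior = posterior * (1 - p_detect_if_aliens) / p_null

print(posterior)  # ~0.00004: the accumulated absence of evidence has become
                  # strong evidence of absence
```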
What do you think of any of the humorous writings (not sure what you’d define as ‘joke’) in my GPT-3 page? I noted where I could find similar examples in Google search, so the rest are ‘original’ as far as I know.
(Fixed. This is a surname typo I make an unbelievable number of times because I reflexively overcorrect it to ‘Sumners’, due to reading a lot more of Scott Sumner than Larry Summers. Ugh—just caught myself doing it again in a Reddit comment...)
The official OA press releases are out confirming The Information: https://openai.com/blog/review-completed-altman-brockman-to-continue-to-lead-openai https://openai.com/blog/openai-announces-new-members-to-board-of-directors
“I’m pleased this whole thing is over,” Altman said at a press conference Friday.
He’s probably right.
As predicted, the full report will not be released, only the ‘summary’ focused on exonerating Altman. Also as predicted, ‘the mountain has given birth to a mouse’ and the report was narrowly scoped to just the firing: they bluster about “reviewing 30,000 documents” (easy enough when you can just grep Slack + text messages + emails...), but then admit that they looked only at “the events concerning the November 17, 2023 removal” and interviewed hardly anyone (“dozens of interviews” barely even covers the immediate dramatis personae, much less any kind of investigation into Altman’s chip stuff, Altman’s many broken promises, Brockman’s complainers, etc). Doesn’t sound like they have much to show for over 3 months of work by the smartest & highest-paid lawyers, does it…

It also seems like they indeed did not promise confidentiality or set up any kind of anonymous reporting mechanism, given that they mention no such thing and include setting up a hotline for whistleblowers as a ‘recommendation’ for the future (ie. there was no such thing before or during the investigation). So, it was a whitewash from the beginning.

Tellingly, there is nothing about Microsoft, and no hint their observer will be upgraded (or that there still even is one). And while flattering to Brockman, there is nothing about Murati—free tip to all my VC & DL startup acquaintances: there’s a highly competent AI manager who’s looking for exciting new opportunities, even if she doesn’t realize it yet.
Also entertaining is that you can see the media spin happening in real time. What WilmerHale signs off on:
WilmerHale found that the prior Board acted within its broad discretion to terminate Mr. Altman, but also found that his conduct did not mandate removal.
Which is… less than complimentary? One would hope a CEO does a little bit better than merely not engaging in ‘conduct which mandates removal’? And it turns into headlines like:
“OpenAI’s Sam Altman Returns to Board After Probe Clears Him”
(Nothing from Kara Swisher so far, but judging from her Twitter, she’s too busy promoting her new book and bonding with Altman over their mutual dislike of Elon Musk to spare any time for relatively-minor-sounding news.)
OK, so what was not as predicted? What is surprising?
This is not a full replacement board, but implies that Adam D’Angelo/Bret Taylor/Larry Summers are all staying on the board, at least for now. (So the new composition is D’Angelo/Taylor/Summers/Altman/Desmond-Hellmann/Seligman/Simo, plus the unknown Microsoft non-voting observer.) This is surprising, but it may simply be a quotidian logistics problem—they hadn’t settled on 3 more adequately diverse and prima-facie qualified OA board candidates yet, but the report was finished and it was more important to wind things up, and they’ll get to the remainder later. (Perhaps Brockman will get his seat back?)
EDIT: A HNer points out that today, March 8th, is “International Women’s Day”, and this is probably the reason for the exact timing of the announcement. If so, they may well have already picked the remaining candidates (Brockman?), but those weren’t women and so got left out of the announcement. Stay tuned, I guess. EDITEDIT: the video call/press conference seems to confirm that they do plan more board appointments: “OpenAI will continue to expand the board moving forward, according to a Zoom call with reporters.” So that is consistent with the hurried women-only announcement.
At least from the intro, it sounds like my predictions were on-point: re-appointed Altman (I waffled about this at 60%, because while his narcissism/desire to be vindicated required him to regain his board seat (anything less is a blot on his escutcheon), and the pragmatic desire to lock down the board also strongly militated for his reinstatement, it also seemed so blatant a power grab in this context that surely he wouldn’t dare...? guess he did), released to an Altman outlet (The Information), with 3 weak, apparently ‘independent’ and ‘diverse’ directors to pad out the board and eventually be replaced by full Altman loyalists—although I bet if one looks closer into these three women (Sue Desmond-Hellmann, Nicole Seligman, & Fidji Simo), one will find at least one has buried Altman ties. (Fidji Simo, Instacart CEO, seems like the most obvious one there: Instacart was YC S12.)
You have been a bad Bing: