I’m out of the loop. Did Daniel Kokotajlo lose his equity or not? If the NDA is not being enforced, are there now some disclosures being made?
Thanks for the source.
I’ve intentionally made it difficult for myself to log into twitter. For the benefit of others who avoid Twitter, here is the text of Kelsey’s tweet thread:
I’m getting two reactions to my piece about OpenAI’s departure agreements: “that’s normal!” (it is not; the other leading AI labs do not have similar policies) and “how is that legal?” It may not hold up in court, but here’s how it works:
OpenAI like most tech companies does salaries as a mix of equity and base salary. The equity is in the form of PPUs, ‘Profit Participation Units’. You can look at a recent OpenAI offer and an explanation of PPUs here: https://t.co/t2J78V8ee4
Many people at OpenAI get more of their compensation from PPUs than from base salary. PPUs can only be sold at tender offers hosted by the company. When you join OpenAI, you sign onboarding paperwork laying all of this out.
And that onboarding paperwork says you have to sign termination paperwork with a ‘general release’ within sixty days of departing the company. If you don’t do it within 60 days, your units are cancelled. No one I spoke to at OpenAI gave this little line much thought.
And yes this is talking about vested units, because a separate clause clarifies that unvested units just transfer back to the control of OpenAI when an employee undergoes a termination event (which is normal).
There’s a common legal definition of a general release, and it’s just a waiver of claims against each other. Even someone who read the contract closely might be assuming they will only have to sign such a waiver of claims.
But when you actually quit, the ‘general release’? It’s a long, hardnosed, legally aggressive contract that includes a confidentiality agreement which covers the release itself, as well as arbitration, nonsolicitation and nondisparagement and broad ‘noninterference’ agreement.
And if you don’t sign within sixty days your units are gone. And it gets worse—because OpenAI can also deny you access to the annual events that are the only way to sell your vested PPUs at their discretion, making ex-employees constantly worried they’ll be shut out.
Finally, I want to make it clear that I contacted OpenAI in the course of reporting this story. So did my colleague Sigal Samuel. They had every opportunity to reach out to the ex-employees they’d pressured into silence and say this was a misunderstanding. I hope they do.
Even acknowledging that the NDA exists is a violation of it.
This sticks out pretty sharply to me.
Was this explained to the employees during the hiring process? What kind of precedent is there for this kind of NDA?
There are things I would buy if they existed. Is there any better way to signal this to potential sellers, other than tweeting it and hoping they hear? Is there some reason to believe that sellers are already gauging demand so completely that they wouldn’t start selling these things even if I could get through to them?
Would I somehow feel this problem less acutely if I had never been taught Fahrenheit, Celsius, or Kelvin; and instead been told everything in terms of gigabytes per nanojoule? I guess probably not. Inconvenient conversions are not preventing me from figuring out the relations and benchmarks I’m interested in.
It’s important to remember, though, that I will be fine if I so choose. After all, if the scary impression was the real thing then it would appear scary to everyone.
Reading this makes me feel some concern. I think it should be seriously asked: Would you be fine if you hypothetically chose to take a gap year or drop out? Those didn’t feel like realistic options for me when I was in high school and college, and I think this ended up making me much less fine than I would have been otherwise. Notably, a high proportion of my close friends in college ended up dropping out or having major academic problems, despite being the smartest and most curious people I could find.
My experiences during and after college seemed to make a lot more sense after hearing about ideas like credential inflation, surplus elites, and the signaling model. It seems plausible that I might have made better decisions if I had been encouraged to contemplate those ideas as a high schooler.
In measuring and communicating about the temperature of objects, humans can clearly and unambiguously benchmark things like daily highs and lows, fevers, snow, space heaters, refrigerators, a cup of tea, and the wind chill factor. We can place thermometers and thereby say which things are hotter than others, and by how much. Daily highs can overlap with fevers, but neither can boil your tea.
But then I challenge myself to estimate how hot a campfire is, and I’m totally stuck.
It feels like there are no human-sensible relationships once you’re talking about campfires, self-cleaning ovens, welding torches, incandescent filaments, fighter jet exhaust, solar flares, Venus, Chernobyl reactor #4, the anger of the volcano goddess Pele, fresh fulgurites, or the boiling point of lead. Anything hotter than boiling water has ascended into the magisterium of the Divinely Hot, and nothing more detailed can be said of it by a mortal. If I were omnipotent, omniscient, & invulnerable, then I could put all those things in contact with each other and then watch which way the heat flows. But I am a human, so all I can say is that anything on that list could boil water.
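(If I let myself cheat with half-remembered numbers instead of intuition, here is a minimal sketch of the benchmark ladder I wish I had. Every value below is a rough, from-memory assumption, possibly off by a large margin; the only point is the shape of the ladder.)

```python
# Rough temperature ladder in kelvin. Every number is a from-memory
# approximation (an assumption, not a looked-up fact); the spread of the
# ladder, not the exact values, is the point.
ROUGH_BENCHMARKS_K = {
    "boiling water": 373,
    "surface of Venus": 740,
    "self-cleaning oven cycle": 750,
    "campfire flame": 1100,
    "boiling point of lead": 2000,
    "incandescent filament": 2800,
    "welding torch flame": 3300,
    "solar flare plasma": 10_000_000,
}

# Print the ladder from coolest to hottest.
for name, kelvin in sorted(ROUGH_BENCHMARKS_K.items(), key=lambda kv: kv[1]):
    print(f"{name:<26} ~{kelvin:>12,} K")
```

Even granting big error bars on every line, the gap from a campfire to a solar flare is something like four orders of magnitude, which is exactly the kind of relationship my unaided intuition refuses to supply.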
Presumably he understood the value proposition of cryonics and declined it, right?
If everyone in town magically receives the same speedup in their “verbal footwork”, is that good for meta-honesty? I would like some kind of story explaining why it wouldn’t be neutral.
Point for yes:
Sure seems like being able to quickly think up an appropriately nonspecific reference class when being questioned about a specific hypothetical does not make it harder for anyone else to do the same.
Point against:
The code of literal truth only lets people navigate anything like ordinary social reality to the extent that they are very fast on their verbal feet, and can respond to the question “How are you?” by saying “Getting along” instead of “Horribly” or with an awkward silence while they try to think of something technically true.
This particular case seems anti-inductive and prone to the euphemism treadmill. Indeed, one person one time can navigate ordinary social reality by saying “Getting along” instead of giving an awkward silence; but many people doing so many times will find that it tends to work less well over time. If everyone magically becomes faster on their verbal feet, they can all run faster on the treadmill, but this isn’t necessarily good for meta-honesty.
Implications: either cognitive enhancement becomes even more of a moral priority, or adhering to meta-honesty becomes a trustworthy signal of being more intelligent than those who don’t. Neither outcome seems terrible to me, nor even all that much different from the status quo.
One concrete complaint I have is that I feel a strong incentive toward timeliness, at the cost of timelessness. Commenting on a fresh, new post tends to get engagement. Commenting on something from more than two weeks ago will often get none, which makes effortful comments feel wasted.
I definitely feel like there is A Conversation, or A Discourse, and I’m either participating in it during the same week as everyone else, or I’m just talking to myself.
(Aside: I have a live hypothesis that this is tightly related to The Twitterization of Everything.)
Glad to see some discussion of social class.
Here’s something in the post that I would object to:
Non-essential weirdnesses, on the other hand, should be eliminated as much as possible because pushing lifestyle choices onto disinterested working-class people is a misuse of class privilege. Because classes are hierarchical in nature, this is especially important for middle-upper class people to keep in mind. An example of non-essential weirdness is “only having vegan options for dinner”.
This example seems wrong to me. It seems like serving non-vegan options does in fact risk doing a great injustice (to the animals eaten). I tried and failed to think of an example that seemed correct, so now I’m feeling pretty unconvinced by the entire concept.
One contrary idea might be that class norms and lifestyle choices are usually load-bearing, often in ways that are deliberately obscured or otherwise non-obvious. Therefore, one may want to be cautious when labeling something a non-essential weirdness.
(Also maybe worth mentioning that I think class phenomena are in general anti-inductive and much harder to reach broad conclusions about than other domains.)
Most people, even most unusually honest people, wander about their lives in a fog of internal distortions of reality. Repeatedly asking yourself of every sentence you say aloud to another person, “Is this statement actually and literally true?”, helps you build a skill for navigating out of your internal smog of not-quite-truths. For that is our mastery.
I think some people who read this post ought to reverse this advice. The advice I would give to those people is: if you’re constantly forcing every little claim you make through a literalism filter, you might end up multiplying disfluencies and generally raising the cost of communicating with you. Maybe put a clause limit on your sentences and just tack on a generic hedge like “or something” if you need to.
Only praise yourself as taking ‘the outside view’ if (1) there’s only one defensible choice of reference class;
I think this point is underrated. The word “the” in “the outside view” is sometimes doing too much work, and it is often better to appeal to an outside view, or multiple outside views.
What do you think the internal experience of these liars is like? I could believe that some of them have gotten a lot of practice with fooling themselves in order to fool others, in settings where doing so is adaptive. Do you think they would get different polygraph results than the believer in the invisible dragon hypothetically would?
Damn, woops.
My comment was false (and strident; worst combo). I accept the strong downvote, and I will now try to make a correction.
I said:
I spent a bunch of time wondering how you could put 99.9% on no AI ever doing anything that might be well-described as scheming for any reason.
What I meant to say was:
I spent a bunch of time wondering how you could put 99.9% on no AI ever doing anything that might be well-described as scheming for any reason, even if you stipulate that it must happen spontaneously.
And now you have also commented:
Well, I have <0.1% on spontaneous scheming, period. I suspect Nora is similar and just misspoke in that comment.
So... I challenge you to list a handful of other claims that you have similar credence in. Special Relativity? P!=NP? Major changes in our understanding of morality or intelligence or mammal psychology? China pulls ahead in AI development? Scaling runs out of steam and gives way to other approaches like mind uploading? Major betrayal against you by a beloved family member?
The OP simply says “future AI systems” without specifying anything about these systems, their paradigm, or what offworld colony they may or may not be developed on. Just...all AI systems henceforth forever. Meaning that no AI creators will ever accidentally recapitulate the scheming that is already observed in nature...? That’s such a grand, sweeping claim. If you really think it’s true, I just don’t understand your worldview. If you’ve already explained why somewhere, I hope someone will link me to it.
Foregone mutually beneficial trades sometimes provide value in the form of plausible deniability.
If a subculture started trying to remove barriers to trade, for example by popularizing cheerful prices, this might have the downside of making plausible deniability more expensive. On net that might be good or bad (or weird), but either way I think it’s an underrated effect (because I also think that the prevalence and load-bearing functions of plausible deniability are also underrated). People have prospects and opportunity costs, often largely comprising things that are more comfortable to leave unsaid.
(Continuing from this comment.)
EDIT: This is wrong. See descendent comments.
I spent a bunch of time wondering how you could put 99.9% on no AI ever doing anything that might be well-described as scheming for any reason. I was going to challenge you to list a handful of other claims that you had similar credence in, until I searched the comments for “0.1%” and found this one.
I’m annoyed at this, and I request that you prominently edit the OP.
I followed this exchange up until here and now I’m lost. Could you elaborate or paraphrase?
I will push against.
I feel unhappy with this post, and not just because it called me an idiot. I think epithets and thoughtless dismissals are cheap and oversupplied. Patience and understanding are costly and undersupplied.
A lot of the seemingly easy wins in Mark’s list were not so easy for me. Becoming more patient helped me a lot, whereas internal vitriol made things worse.
I benefitted hugely from Mr. Money Mustache, but I think I was slower to implement his recommendations because he kept calling me an idiot and literally telling me to punch myself in the face.
If a bunch of people get enduring benefits from adopting the “such an idiot” frame, then maybe I’ll change my mind. (They do have to be enduring though.)
Here is a meme I would be much happier to see spread:
You, yes you, might be able to permanently lower the cost of exercise to yourself if you spend a few days’ worth of discretionary resources on sampling the sports in Mark Xu’s list. But if you do that and it doesn’t work, then ok, maybe you really are one of the metabolically underprivileged, and I hope you figure out some alternative.
Side notes:
It seems like this post is in tension with Beware Other Optimizing. And perhaps also a bit with Do Life Hacks Ever Reach Fixation? Not exactly, because Mark’s list mostly relies on well-established life upgrades. But insofar as there is a tension here, I will tend to take the side of those two posts.
Perhaps this is a needless derail, and if so I won’t press it, but I’m feeling some intense curiosity over whether Mark Xu and Critch would agree about whether Critch at all qualifies as an idiot. According to Raemon, Critch recently said, “There aren’t things lying around in my life that bother me because I always notice and deal with it.”
I find something both cliche and fatalistic about the notion that lots of seemingly maladaptive behaviors are secretly rational. But indeed I have had to update quite a few times in that direction over the years since I first started reading LessWrong.
Without passing judgment on this, I think it should be noted that it would have seemed less out of place when the Sequences were fresh. At that time, the concept of immaterial souls and the surrounding religious memeplexes seemed to be genuinely interfering with serious discussion about minds.
However, and relatedly, there was not a lot of cooking discussion on LW in 2009, and this tag was created in 2020.