I think it would be good if it felt to me like this post was very aware of the lines around honesty and why they are so important, and made a case for this being an exception. Like, corporate sabotage is potentially valid in some cases, but doing it to anyone you think is unethical is far too low a bar, and that equilibrium is just great societal dysfunction.
I’ve seen a lot of posts this week saying that employees are morally obligated to quit OpenAI immediately. But I wouldn’t go that far: I’d only say that you’re obligated to stop doing good work.
Just to be clear, quitting a place because you disagree with the direction it has taken is a far lower bar than staying there and dishonorably trying to screw things up.
Really, why would you care if you put in less effort and OpenAI eventually fires you?
Uh, because I’m not a person who screws people over whenever it’s convenient to me. (Strong-downvoted.)
This reads to me as obvious self-deception. Did you not make an agreement with the staffer who hired you that you would work in OpenAI’s interests while there? Do you not each day set the implicit expectation with your colleagues that you’re on the same team?
I reflected years ago that all contracts come with a hidden, secret clause, which is that “I am not trying to screw you over with this agreement”. You can be the sort of player who doesn’t have that, but this means I don’t want to make most deals with you, because I would have to put in so much extra work to make sure you’re not screwing me over.
Telling someone you’re secretly screwing them over for their own good… is not an honest or honorable way of interfacing with someone, and should not be the norm, including for people who you have severe disagreements with.
When hired by an employer, we agree to do certain work in exchange for compensation, not to optimize for the employer’s interests or what the CEO thinks the employer’s interests are. The implicit expectation with my colleagues is that I’m on their team, not necessarily the company’s. I work in my employer’s interests because I care about maximizing impact, because I take pride in my work, and because I explicitly told my manager I would finish a certain project this week.
In my view the implicit expectation you have of people by default is fairly weak, and signing a contract doesn’t change this much. In fact, the point of a contract is to make obligations explicit so we don’t have to rely on implicit trust.
When hired by an employer, we agree to do certain work in exchange for compensation, not to optimize for the employer’s interests or what the CEO thinks the employer’s interests are.
Actually it’s common for great companies to have visions that the people believe in that they make part of the hiring and onboarding process, and explicitly label and talk through (e.g. SpaceX’s “we’re going to Mars” or Stripe’s “Increase the GDP of the internet”). I think this is good, and I strongly expect that this is part of the culture Altman has set at OpenAI, so I expect it is much more of an implicit agreement there than it is if you (say) work at a restaurant as a waiter.
There are many equilibriums about how much people expect others to believe in the company that they work at. I guess I am coming at this from a culture where people work at a place because they believe in it, and I think this is a better equilibrium.
It’s something of an empirical question how good the existing companies are and how feasible it is to only work at places where you believe you’re improving the world. But it does seem to me that, if you’re at OpenAI but think it’s harmful to the world, you can just leave and make decent money elsewhere; I don’t think anyone is particularly trapped at the job there.
I guess I see these visions more as things companies try to filter for, inculcate, and perhaps require of executives, rather than ideologies that a rank-and-file engineer is ethically required to adopt. Maybe Lightcone and SpaceX are exceptions, but employees at most companies have a variety of reasons for working there. I’d guess the most common motivation for AI engineers is money. Is it dishonorable for a cracked IC at OpenAI to take a promotion to manager where they’re less effective?
Ok, what if they are motivated by OpenAI’s stated mission: “to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity”? It doesn’t say you should defer to Sam Altman and act as if endowing GPT-5.5 with the capability to spy on Americans benefits all of humanity. While I don’t agree with everything in the OP, it seems perfectly reasonable for an OpenAI employee who wants to benefit all of humanity to take protest actions, including slacking off at work and focusing on office politics if this is better than quitting. Why not just leave? Well, you could become a whistleblower, or the office politics could pay off and let you influence OpenAI for the better.
OpenAI leadership broke that implicit contract first. It was originally supposed to be a philanthropic thing for the benefit of humanity. It was supposed to be “open”. Then it became for-profit, now it’s going to work on killer robots for the military. To whatever extent there’s an implicit contract like you describe, it would also apply to the work that people previously did for it under false pretenses!
That seems like a legit point! It’s just very important to include information like that in your post when you’re proposing corporate sabotage. In political and murky domains where there’s so much noise and adversarial spreading of false narratives, it’s really scary to see posts that just say “I’m sure we all agree <thing> is bad, now let us do underhanded and dishonorable things to attack <thing>”. You can write that independent of whether something is good or bad, and instead write it whenever something is unpopular. It’s both more truth-tracking and less scary to see someone write “I encourage you to do underhanded things to <thing> because of <list of underhanded things that we know they did>.”
You open the post with a list of things that, while bad, are at best reason to quit and protest the company, not reasons to be dishonorable. This section, as far as I can tell, was about as much as you spent actually justifying corporate sabotage:
There is more open debate than I thought there would be, at least in this part of Twitter, about whether we should prefer a democratically elected government or unelected private companies to have more power. I guess this is something people disagree on, but…I don’t. This seems like an important area for more discussion.
Let’s be clear: this was not about Anthropic telling the US military not to work on autonomous weapons on its own. Altman is advocating for the government being able to require private companies (and their employees) to provide whatever services it wants, even if they don’t currently do that thing. I know the term “fascism” has been thrown around a lot, but that is Actual Fascism. Here are some other ways to use that argument:
“Why should Sam Altman decide what should be done with that billion dollars instead of the government, which reflects the will of the people?”
“Why should a private citizen get to decide they don’t want to spy on their neighbors and report any hidden Jews? That should be the decision of the government, which reflects the will of the people!”
This jump to ‘fascism’ is just cheap. Are you aware that Altman has repeatedly and publicly stated:
In my conversations over the weekend, I reiterated that Anthropic should not be designated as a SCR, and that we hope the DOW offers them the same terms we’ve agreed to.
Insofar as his endorsing the government’s threat against a private company is the ‘fascism’ you’re accusing him of, he has spoken out against it. You may wish to make some argument about why these words are not representative of the actions he will take, but instead you decided it was fine to label someone ‘Actual Fascism’ with capitals. Please hold yourself to higher standards than this.
The biggest problem that I have with the post is that quitting your job is considered a bigger deal than giving up on being an honorable person. That seems very far from what I consider good behavior. I myself have gone and protested outside of OpenAI due to them racing to develop AGI while being well aware that this poses an extinction-level threat to humanity, so I know it is quite possible to oppose a company without acting dishonorably in the process. If you work at OpenAI and no longer believe in the ethics of the company, you can just do the decent thing and quit.
I’m sure it’s possible to write a better version of this post. I hope someone does. Believe it or not, my specialty is engineering, not rhetoric.
My assessment of Sam Altman is that he’s a very good actor, very untrustworthy, and a nihilistic power-seeker who cares very little about benefit or harm to humanity as a whole. I agree that this post alone is only weak support for that assessment. A proper “compendium of reasons not to trust Sam Altman” would probably end up being a considerably longer post.
No, it’s screwing people over because you’ve committed to doing something, and that something screws over people. If there was no honorable way out, that would be a difficult moral question; benevolence vs. trustworthiness. But there’s an easy way out: employment is at will and you can simply leave.
There are two distinct desirable properties to have, ethically. (Probably more than two, but two that I can point to here.) One is benevolence: to do, and seek to do, things that are good for people; people in your circle of concern, typically, but that could be everyone who will ever live or just your family or in between. It’s also benevolent to expand your circle of concern on purpose. The other, trustworthiness, is to deal fairly with people when benevolence doesn’t require it. And you might say ‘all humans alive are in my circle, so benevolence covers everyone and the second one doesn’t matter.’ But this is not correct, because most of those people don’t trust that they are within your circle. They shouldn’t! You have not given them reason to. It’s very easy to say you care about all humanity (observe the case of 2015!Sam Altman, who even looked like he was paying costs for the belief) and very hard to prove it (again, see how he did not, at all, care about it, and screwed over everyone who founded OpenAI with him).
Either one has to be proved, and both are desirable even if you already proved one. Benevolence you prove by showing you keep paying real costs to help people, costs that you don’t benefit from except via their future benevolence, and by being predictably good for people with similar circles of concern. Then benevolent people see you as a kindred spirit and want to help you, because you’ll pass on help to others they also care about. Trustworthiness you prove by living up to your promises to people, maintaining commitments, and, when you thought you could maintain them but can’t, winding down the commitments and paying recompense (monetary or not). Proving trustworthiness is most effective at proving you’re reliable when done with people you don’t much care about, because it shows you’re not up to lying to them, and so are less likely to be lying to those you call friends. Then people can trust that if they extend trust and credit to you (money, borrowed tools, vouching for you, anything), you won’t take the money (or etc.) and run.
It’s very dangerous to have neither, and have people notice that. “It’s dangerous to go alone! Take this!” Nope, no one’s giving you anything. Society is not built to be navigated alone; it assumes you have a reasonable level of both types of honor, and wants to filter out people who don’t, to make it easier for those who do. “You made a contract with me, didn’t give in to my demands, therefore die!” is a madman’s move, because now everyone who only wants to make contracts with reasonably honorable people knows you’re not one. (Someone very benevolent might pull it off. Benevolent, Pete Hegseth and the USG are not.)
And here’s the kicker: You are coming off as in that category too. Someone extremely untrustworthy, which you appear to proudly be, may still be benevolent. But if you’re sufficiently sloppy in your thinking you can convince yourself lots of selfish things are benevolent. (I could point to SBF, except that I think he was probably never particularly benevolent either.) And your thinking here’s pretty darn sloppy! So I don’t particularly trust that you’ll notice if you’re actually not benevolent, and I certainly don’t trust that you’ll admit it if you aren’t.
Certainly, and I think a sane civilization would throw everyone in jail who has been selling out humanity in this way. But the OP seems to be muddling right and wrong. It reads to me as though it’s meant as general advice for people who think they’re good people too, and I object to this being called good.
Seems good to make the distinction between pen-down / work-to-rule which Lorxus mentions below, and “corporate sabotage” actions that are dishonest or worse: inserting backdoors into code, getting competent people fired, and dumping metal shavings in the lubricant.
You can do the former without outright lying, and it’s probably justified if your employer is evil. Few people are actually aligned with maximizing the profits of their employer. Going farther than this is IMO dubious, at least in this case.
One wrinkle is that the CIA Simple Sabotage Field Manual recommends lots of actions for white-collar employees that don’t require outright lies and are presumably highly effective at grinding the company to a halt. My guess is these should be permitted.
OpenAI leadership says vibe coding is fine, so why review AI code? (You can pretend to spend time on it if you want tho.)
Are you annoyed by unnecessary meetings? Why? Just relax.
Unplugged cable somewhere? Water leak? Not your job.
Lots of bad programmers have succeeded by spending their time on office politics instead. Office politics is a valuable skill! You should get some practice with it!
The first one struck me as dishonest, but I think it’s fair to read the main thrust as “don’t do work you don’t have to”.
I am not sure that one should disagree with advocating dishonesty and not with the two big strategic errors which the post ended up making. As I discussed in the other thread, the ASI race WILL end with a superintelligence, and mankind is to somehow align it to the actual good instead of investors’ whims.
After a potential loss of Anthropic, sandbagging on capabilities research inside OpenAI risks causing GDM or xAI to win the race. If we don’t believe that GDM’s strategy of alignment is working (think of Gemini 3 Pro’s descriptions by Zvi) or if we don’t trust them to create the utopia, then sandbagging on capabilities of OpenAI’s models is likely a bad idea. Attempting to sandbag on alignment research is literally one of the worst things one could do, unless one actually researches ways to secretly align the ASI to a different target, thus preventing OpenAI from creating a dystopia (which also requires proof, since OpenAI’s stated mission is not dystopian, and one could align the AI to literally follow the stated mission).
As for aligning the ASI to a different target, one would have to come up with a scheme which incentivizes the ASI to have goals necessary for the schemer and not for OpenAI. These avenues of attack fall into goal types 2, 4, and 6 of the AI-2027 goals forecast, but require the schemer, at the very least, to carefully design the training environment.
Screwing OpenAI over is not the same as screwing people over!
OpenAI slowing down is in everyone’s interests, including OpenAI workers.
Doing a job that harms people because you get paid is...also screwing over people because it’s convenient to you.
Your ethical framework here doesn’t seem consistent to me, but maybe you can explain how it works.
I may be over-reading into the list in the OP.