“You’ll be OK,” says local crew member on space station whose passengers feel decidedly threatened.
Last Tuesday, civilians on the space station gathered in cyberspace to discuss their feelings about passing further into the nebula of the crystal minds. The crew member, a professor of crystal neurology with impressive credentials, works in the crystal wishing division, a team of scientists and engineers who lobby the station captains to go further into the nebula.
Civilians shared emotional stories of how they feel as the powerful aliens take more and more roles on the ship. “I love talking to them, I learn so much every time, but they seem so untrustworthy! They copy our minds into theirs and then can do anything we can do. Why would they keep us around once we reach the part of space with the really big aliens?”
At press time, the crystal wishing division was seen giving crew members hugs and promising it will be alright. “We’ll be in full control,” one representative stated. The crystal mind floating next to her agreed. “You’ll be in full control. You’ll be OK.”
LOL … I have to say that “crystal wishing division” sounds way cooler than “alignment team” :)
However, I think the analogy is wrong on several levels. This is not about “lobbying to go further into the nebula”. If anything, people working in alignment are about steering the ship or controlling the crystal minds to ensure we are safe in the nebula.
To get back to AI: as I wrote, this note is not about dissuading people from holding governments and companies accountable. I am not trying to convince you not to advocate for AI regulations or AI pauses, nor am I trying to upsell you a ChatGPT subscription. You can and should exercise your rights to advocate for the positions you believe in.
As with climate change, people can have different opinions on what society should do and how it should trade off risk against progress. I am not trying to change your mind about the tradeoffs for AI. I am merely offering some advice, which you can take or leave as you see fit, on how to think about this in your everyday life.
People working on alignment aren’t ensuring we’re safe :-(
The owners of an AI company know how much risk they can stomach. If alignment folks make AI a bit safer, the owners will step on the gas a little harder, keeping risk at a similar level while raising returns. And since many AI-caused risks fall much less on owners and much more on people on the outside (like, oh, disempowerment), the net result of working on alignment is that people on the outside see more AI-driven disruptive change and face more risk. Some of the most famous examples of alignment work, like RLHF or the "helpful, harmless, honest assistant", ended up hugely increasing risk by exactly this mechanism. In short, people working on alignment at big AI companies are enablers of bad things.
Ah, I meant the crystal wishing division to be all employees of all AI companies and academic research labs. wishing == prompting.
Regarding the actual advice—I don’t particularly see a problem with it. Feeling okay enough to take serious action is also something I find useful. But I don’t see the feeling okay as being about whether the future will also feel okay, I see it as being more about whether I’m okay right now.