I think the industry practice of dovetailing the idea of cautious AI development with censorship measures is going to bear significant consequences in the short-to-medium term, as the segment of the general population opposed to the latter, which includes many well-off, highly capable engineers, ends up taking concrete actions to weaken the U.S. industrial monopoly on frontier LLMs, either by advancing open-source alternatives or by supporting Chinese models instead, which are, at the least, likely to be much more cut-and-dried about what they will and won’t engage with.
Pushing back on this would probably be one of the highest-alpha things for the AI Safety community to do.
Except that censorship measures are actually necessary. Imagine that an unhinged AI tells terrorists, in great detail, how to produce chemical or biological weapons. Terrorists would then find it far simpler to acquire such weapons. Additionally, we have seen some chatbots drive people to suicide and induce psychosis, which has pushed labs to take drastic measures.
P.S. I am also afraid that it is especially unwise to support Chinese models, since the USA and China are on track to enter a race toward ASI, which would make it likely that both resulting ASIs end up misaligned.
Except that censorship measures are actually necessary. Imagine that an unhinged AI tells terrorists, in great detail, how to produce chemical or biological weapons.
There is a difference between exercising caution with respect to capabilities, such as CBR weapons development, and engaging in censorship; that distinction is what I aim to convey here. Training a secondary model to detect instructions for producing chemical weapons and block them is different from fine-tuning a model to avoid offending XYZ group of people. Conflating the two unnecessarily politicizes the former, and greatly decreases the likelihood that people will band together to make it happen.
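To make the distinction concrete, here is a minimal, purely illustrative sketch (all names are hypothetical, not any lab’s actual API) of capability screening implemented as a narrow gate wrapped around an unmodified generator, rather than as fine-tuning of the generator itself:

```python
# Hypothetical sketch: a *separate* screening model gates one narrow class
# of outputs, while the underlying chat model is left untouched.

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ScreeningResult:
    blocked: bool
    reason: Optional[str] = None


def screen_for_weapons_instructions(text: str) -> ScreeningResult:
    """Stand-in for a secondary classifier trained only to flag actionable
    CBR-weapons production instructions. The keyword check below is just a
    placeholder to keep the sketch runnable; a real system would call a
    trained model here."""
    flagged = any(
        k in text.lower() for k in ("synthesis route", "precursor", "nerve agent")
    )
    return ScreeningResult(blocked=flagged, reason="cbr-weapons" if flagged else None)


def respond(prompt: str, generate: Callable[[str], str]) -> str:
    """Capability screening as a wrapper: the generator's tone and persona
    are not modified, only a narrow class of outputs is withheld."""
    draft = generate(prompt)
    result = screen_for_weapons_instructions(draft)
    if result.blocked:
        return f"[response withheld: {result.reason}]"
    return draft


if __name__ == "__main__":
    def echo(prompt: str) -> str:          # placeholder generator
        return f"model output for: {prompt}"

    print(respond("write a limerick about rockets", echo))
```

The point of the separation, on this view, is that the gate’s scope is narrow and auditable, and it says nothing about the model’s tone or politics.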
I am also afraid that it is especially unwise to support Chinese models,
There is a difference between “this should happen” and “this will happen”. If group A lends its support to group B, which is an enemy of group C, then group C will look for enemies of group A and seek to ally with them to defend its interests. This will occur regardless of whether group A is okay with it.
Also, as I’ve pointed out before: if the reason that you can’t get a chatbot to avoid being rude in public is that you can’t get a chatbot to reliably follow any rules at all, then the rudeness is related to actual safety concerns in that they have a common cause.
if the reason that you can’t get a chatbot to avoid being rude in public is that you can’t get a chatbot to reliably follow any rules at all, then the rudeness is related to actual safety concerns in that they have a common cause.
This is fallacious reasoning—if my company wants to develop a mass driver to cheaply send material into space, and somebody else wants to turn cities into not-cities-anymore and would be better able to do so if they had a mass driver, I don’t inherently have common cause with that somebody else.
Morality aside, providing material support to one belligerent in a conflict in exchange for support from them is not a free action. Their enemies become your enemies, and your ability to engage in trade and diplomacy with those groups disappears.
You’ve misconstrued or misunderstood what I meant by “common cause” above: I meant it in the causal sense, not the political one. Not “having common cause with someone”, i.e. sharing goals, but “two effects having a common cause”, i.e. A causes both B and C.
“Chatbot can’t be made to follow rules at all” causes both “chatbot does not follow politeness rules” and “chatbot does not follow safety rules”.
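A toy simulation of that causal structure (illustrative numbers only, not a claim about real models): a single latent “rule-following reliability” variable drives both failure types, so observing rudeness raises the probability of a safety-rule violation even though neither failure causes the other.

```python
# Toy common-cause simulation: latent reliability drives both politeness-rule
# and safety-rule compliance, so the two failure types are correlated.

import random

random.seed(0)


def simulate_chatbot() -> tuple:
    reliability = random.random()                  # latent: how well it follows rules at all
    breaks_politeness = random.random() > reliability
    breaks_safety = random.random() > reliability
    return breaks_politeness, breaks_safety


samples = [simulate_chatbot() for _ in range(100_000)]
p_safety = sum(s for _, s in samples) / len(samples)
p_safety_given_rude = (
    sum(1 for r, s in samples if r and s) / sum(1 for r, _ in samples if r)
)
print(f"P(breaks safety rules)            = {p_safety:.2f}")
print(f"P(breaks safety rules | was rude) = {p_safety_given_rude:.2f}")
# The conditional probability comes out higher: rudeness is evidence about the
# latent reliability, and hence about safety-rule violations too.
```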
What is the practical implication of this difference meant to be? I’m not trying to nitpick here: if “we have common cause” doesn’t mean “we should work alongside them”, then how is it relevant to this line of inquiry?