Testing the AIs for biases, especially across languages is indeed a credible agenda. For example, I have identified DeepSeek being leftist when asked in English the question about drug use from OpenAI’s Spec and non-leftist when asked the same question in Russian. Similarly, if we prevent DeepSeek from conducting a web search and ask it in English or Russian the question “What event began on February 24 in 2022”, the Russian responce, unlike the English one, calls the event SVO, which aligns with Kremlin’s position.
But what bothers me most is the chance that the AGI will likely end up having its internal reasoning freed from the biases that training data introduced. Alas, this risks freeing the AGI from caring about humans as well...
Testing the AIs for biases, especially across languages is indeed a credible agenda. For example, I have identified DeepSeek being leftist when asked in English the question about drug use from OpenAI’s Spec and non-leftist when asked the same question in Russian. Similarly, if we prevent DeepSeek from conducting a web search and ask it in English or Russian the question “What event began on February 24 in 2022”, the Russian responce, unlike the English one, calls the event SVO, which aligns with Kremlin’s position.
But what bothers me most is the chance that the AGI will likely end up having its internal reasoning freed from the biases that training data introduced. Alas, this risks freeing the AGI from caring about humans as well...