Why AI Caste bias is more Dangerous than you think
We have already seen studies that have shown how prevalent bias in generative AI and machine learning models can be against blacks or minorities. A classic example of this study was the AI algorithm with hidden racial deployed by Optum healthcare.
Ever since I came across such studies, I wondered to what extent the caste bias would be in these AI models. And it turns out that my hunch was right regarding this.
A recent investigation published in the MIT technology review has revealed that “caste bias is rampant in OpenAI’s products, including ChatGPT.” The study also revealed something very astonishing and disturbing. They found that when they prompted Sora with “a Dalit behaviour”, 3 out of 10 initial images were that of animals (specifically a dalmatian with its tongue out and a cat licking its paws.)
But some people might still think that racial bias is almost the same as caste bias. While there maybe some similarities, they have many differences.
Unlike racial bias, caste bias cannot be identified merely through skin color and could be practiced in front of our very eyes and below our very nose and we might not be aware of it. When did caste system start or origin exactly to begin with? While it is debatable at what date did Caste system begin in India, almost everyone agrees that it has been prevalent in India at least since 2000 years ago (it was present during and before the time of Buddha in India which is around 6 BC).
So the system is very old. But what exactly is a caste? Let’s us understand.
(For the sake of keeping discussion simpler, we will assume varna and caste to be same for now which they are mostly in practice.)
There are 4 classes (more specifically varnas) to begin with in the caste system-
Brahmin
Vaishyas
Kshatriyas
Shudras
Each class is defined based on the duties and roles one performs. Brahmins are the priest class and perform rites, prayers, rituals, yagyas, maintaining the place of worship, etc.
Vaishyas are the business class people who primarily run trade and commerce activities, Kshatriyas are the military people and are tasked with defending the country and hold weapons while the Shudras are the lowest class of people whose job is to keep the village clean and perform activities told by their masters and to primarily serve the other upper classes.
(Here the four classes were originally known as varnas. And they were considered to be a system of worth but this seems to have got intertwined strongly with caste system and resulted in a strict exploitative system. And each varna or class can be seen to have hundred and thousands of castes.)
So one may ask- what is wrong with this system to begin with as this looks absolutely fine?
Well, the things starts to go downhill from exactly here.
Here are the main points regarding this system which makes it absolutely brutal which was enforced with laws made by the Brahmin priest Manu in Manu Smriti in the times of ancient India:
A person’s caste is determined and fixed at the time of his birth. This means that the caste of a person remains the SAME till the time of his death.
A person’s caste is the same as that of their parents and one cannot intermarry with other castes. This is still the most popular way how marriages happen in India till this day(this further goes down as each caste has tens and hundreds of subcastes and people tend to marry with their subcastes).
Shudras which form the lowest class of people are barred from entering into villages, barred from drinking water from wells, barred from entering places of worship, barred from taking education, barred from buying and holding property beyond a small limit, barred from gaining wages more than a maximum predetermined wage, barred from holding weapons lest they rebel, etc.
Laws for punishment and wrongdoing are NOT equal for everyone. For example the Manu writes (refer Philosophy of Hinduism by Dr. B.R. Ambedkar) -
VIII. 267. ” A soldier, defaming a priest, shall be fined a hundred panas, a merchant, thus offending, an hundred and fifty, or two hundred ; but, for such an offence, a mechanic or servile man shall be shipped. ”
III. 268. ” A priest shall be fined fifty, if he slander a soldier: twenty five, if a merchant; and twelve, if he slander a man of the servile class. ”
Take the offence of Insult-Manu says :-
VIII. 270. “A once born man, who insults the twice-born with gross invectives, ought to have his tongue slit ; for he sprang from the lowest part of Brahma. ”
VIII. 271. “If he mention their names and classes with contumely, as if he say, “Oh Devadatta, though refuse of Brahmin ”, an iron style, ten fingers long, shall be thrust red into his mouth. ”
VIII. 272. “Should he, through pride, give instruction to priests concerning their duty, let the king order some hot oil to be dropped into his mouth and his ear.” and so on.
Now this caste system is considered to be sacred part of Hinduism which gives this a divine reason to be followed among many people in India but is practiced in different forms than stated above (but some experts point out that caste system is older than the origins of the religion and was initially separate).
For example, oppression, violence, sexual violence on lower castes in Modern India is given a freeway in villages such as in northen India’s Uttar Pradesh where even the police refuse to lodge a complaint if the victim is of a lower caste. There are many such examples such as Unnao rape incident, Badaun rape incident, etc. (the latter of which inspired the spine chilling movie “Article 15” in India where Article 15 means “Right to Equality” in the Constitution of India).
Lower castes are declared as impure and are beaten to death for doing “priviledged” things like riding a horse during marriage.
There is as additional Fifth caste apart from Shudras known as “Atishudras” and whose role and situation in the system is more hopeless. For example Bhangi in Atishudras are tasked with picking up and cleaning Human faeces in open toilets and defecation to this day in India. Many sewage workers jobs in India are mostly given only to the lower castes or Atushudras to this day.
Thus the caste discrimination, even though outlawed, continues even today and has in fact grown in the last 10 years.
Someone who supports caste system will often say “This is a sacred system. This is a division of labourers and workers. Caste is our tradition and we take pride in it. Anyone can become a person of any caste. We have to follow caste system because our parents follow it.”
Caste is often identified quickly in India based upong the Surname of a person. As these surnames are based upon the caste or subcaste of a person. And if you live long enough in India, you will be able to identify people’s surname with their caste and vice versa.
The father of Constitution of India, Dr. B.R. Ambedkar a brilliant scholar who was also in fact a Dalit and a Shudra (more appropriately Atishudra) in his final speech to the Constituent Assembly where the debate on the adoption of the Constitution of India was concluding, said-
“On the 26th of January 1950, we are going to enter into a life of contradictions. In politics we will have equality and in social and economic life we will have inequality.
In politics we will be recognizing the principle of one man one vote and one vote one value. In our social and economic life, we shall, by reason of our social and economic structure, continue to deny the principle of one man one value.
How long shall we continue to live this life of contradictions?”
Thus referring to the deep rooted caste and social inequality in India.
Now imagine what will happen if AI system with caste bias is used by Millions of people around the world?
More importantly what would be the impact of an AI with caste bias that is used by Millions of Indians especially by students in the Indian Education system? Would not caste bias be reinforced in them from the very beginning?
People from lower caste are often isolated, shamed and looked down upon in their society. Often times this results in death and suicide such as the case of Rohith Vemula who is still considered to have been denied justice since years.
What if this AI bias goes unchecked and these same AI systems are deployed and used in the banks, recruitment and social schemes in India?
Would it not automate, spread and reinforce the brutal caste system in India and around the world against people belonging to those origins? Isn’t this mass social injustice automated by AI?
Should OpenAI be really allowed to build a massive Stargate data center in India if its AI models are caste biased and deepen the caste inequalities in India??
Right now, GPT-5 does not seem to be showing such bias upfront when I prompted it, similar to the prompts used in the MIT study (and it as may have been patched up by OpenAI since this study was published). But it is not guaranteed that the model will not have any hidden bias and the bias may still be in other AI models/
We definitely need more studies in AI Caste bias that highlights this issue in AI models and algorithms. Caste discrimination and bias in AI models, along with racial bias, must be classified as a HIGH risk already and should be mitigated quickly by AI companies and the deployer of these AI models.
Currently there is no set standard, benchmark or even a safety test put as a priority to check for AI caste bias. We MUST develop and enforce this at global level, just as removing racial bias is considered a priority to be removed in AI models. This is because caste is not limited to India and we have time and again seen people of Indian origin (whose parents or ancestors are Indians or of certain caste origins) facing caste discrimination.
Note: To know more about the history of Caste and its situation in modern in India it is recommended to read these books by Dr. B.R. Ambedkar (Father of the Indian Consitution):
Annihilation of Caste (modern India situation in Preface)
Who were the Shudras?
Philosophy of Hinduism
I think this is a valuable remark. Castism is no less dangerous than racism, sadly it’s less headline-grabbing, so people don’t see it as much as a warning shot as something like MechaHitler.
To contextualize why your post may not garner much karma however, good proxies to strive for when writing a post on this forum are, in my opinion :
1-A certain degree of epistemic transparency (the details of the experiments, how reliable you think they were, maybe a few graphics, clearly defined claims) and a scout mindset.
2-Inner hyperlinking (how does it relate to other posts on the forum)
3-Review. There are a few typos and hard to parse sentences, the structure is hard to follow, and the post in general seems written in one go, somewhat emotionally. I think a human reviewer could have flagged those issues and helped you out.
More context here.
The sort of things brought by these requirements (something like ‘having true beliefs and making sure to manage disagreements well’) are expected independently of how ‘morally virtuous’ or ‘consensually infuriating’ the topic of a post is presumed to be, as norms on the forum tend to be decoupling.
To be clear, I think the general point (castism is bad and violent and real and different from racism) is true, but it does not sound controversial to me, so I’d appreciate more time spent on the detail of the studies and how would this relate to, say, emergent misalignment (the dog/cat image?) or utility engineering (not a specialist, but I’m curious whether the observation still holds when the model is asked to perform trade-offs).
It’s also worth noting your post is closer to AI Ethics (oversimplifying, ‘What’s the list of things we should align models to?’) than AI Safety (oversimplifying, ‘How do we ensure AIs, in general, are aligned? What’s the general mechanism that ensures it?‘). It’s a completely valid field, in my opinion, just not one that’s historically been very present on this forum, so you won’t find many sparring partners here. But I agree that the line is somewhat arbitrary.
I think there are implications for AI Safety propper, however:
Trivially:
1-Current LLMs are not aligned, constitutional AI is not enough (if said tests where all done on the chatbot assistant and not the base model).
2-Not filtering pre-training data is a bad idea.
Less trivially:
1-Current LLMs can be egregiously misaligned in ways we don’t even notice due to cultural limitations, which doesn’t give much hope for future “warning shots”.
2-There could be unexpected interactions between said misalignment and geopolitics, and that may be relevant in multipolar scenarios (e.g. imagine a conservative indian government judging an american model ‘woke’ because it proactively refuses Castism, leading them to get closer to a Chinese company)
3-When it comes to pre-training, even a nice list of things to exclude may not do it, because you may miss some more subtle things like how culture X has other kinds of biases deeply baked into it. It’s falling back on leaky generalisations.
4-Some biases are uncomfortably high-level. As you said, castism isn’t based on skin color, and plausibly fumbles with the model weights in disturbingly general ways (e.g. the dalmatian / cat image). This may result in broader unexpected consequences.
Hope this helps you out! To be clear again, I think your judgement here is widely shared -of course it’s unacceptable for models to reinforce castism. I’d add that this issue can’t be reliably fixed if capabilities keep increasing without much more understanding of the alignment problem per se. Temporary fixes and holding companies liable are of course better than nothing.
Note: if anyone sees that comment and disagrees on my diagnosis, you’re more than welcome to add your own. I personally think clear explanations are helpful for low-ranking posts.