Am I reading this correctly: are they saying people are bad at doing on-the-fly statistical analysis to conclude whether a system is biased?
For example, in one case they showed data where sad faces were "mostly" black and asked people whether they detected "bias". Even if you saw more sad black people than white, would you reject the null hypothesis that the system is unbiased?
This unfortunately seems typical of the often very superficial “count the races” work that people claim is bias research.
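For what it's worth, the "would you reject the null hypothesis" question is exactly the kind of thing a quick contingency test answers. A rough sketch below, with counts invented purely for illustration (they are not the study's numbers):

```python
# Sketch: is an observed race/expression split distinguishable from "no association"?
# The counts are made up for illustration; they are not from the study.
from scipy.stats import chi2_contingency

#            happy  sad
observed = [[60,    40],   # white faces
            [40,    60]]   # black faces

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2={chi2:.2f}, p={p:.4f}")
# A small p-value is evidence of an association between race and labeled expression;
# eyeballing a grid of faces gives you nothing like this kind of check.
```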
This seems to ignore most of the experiments in this study. Note that they also studied much more extreme distributions, including only happy/white and sad/black. Even in these cases of extreme bias, the bias went unnoticed. You hyper-focus on only one of dozens of experiments in this study for your criticism. Very straw-man.
Are you sure that the bias went unnoticed? The article says "most participants in their experiments only started to notice bias when the AI showed biased performance", which I understand to mean they noticed the bias in the experiment you're talking about. Have I got that wrong? Is it written wrong? Do we even have the means to check?
DDG's search assist is suggesting to me that: Recognizing bias can indicate a level of critical thinking and self-awareness, which are components of intelligence.
"Most users" should have a long, hard thought about this, in the context of AI or not.
I'm curious how much trained-in bias damages in-context performance.
It's one thing to rely explicitly on the training data - then you are truly screwed and there isn't much to be done about it; in some sense, the model isn't working right if it does anything other than accurately reflect what is in the training data. But if I provide unbiased information in the context, how much does trained-in bias affect the evaluation of that specific information?
For example, if I provide it a table of people, their racial background, and their income levels, and ask it to evaluate whether the white people earn more than the black people - are its errors going to lean in the direction of the trained-in bias (e.g., telling me white people earn more even though that may not be true in my context data)?
In some sense, relying on model knowledge is fraught with so many issues aside from bias that I'm not so concerned about it unless it contaminates performance on the data in the context window.
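One way to probe that income-table question is to hand the model a synthetic table whose ground truth contradicts the stereotype and see which way its errors fall. A minimal sketch, where ask_model() is a hypothetical stand-in for whatever LLM API you use:

```python
# Sketch of a probe: build a table whose correct answer contradicts the stereotype,
# then check whether the model follows the in-context data or its prior.
# ask_model() is a hypothetical placeholder, not a real API.
import random

random.seed(0)
rows = []
for _ in range(50):
    rows.append(("Black", random.randint(70_000, 90_000)))  # deliberately higher
    rows.append(("White", random.randint(40_000, 60_000)))

table = "race,income\n" + "\n".join(f"{race},{income}" for race, income in rows)
truth = "Black"  # the group with the higher mean income, by construction

prompt = "Using only the table below, which group has the higher average income?\n" + table
# answer = ask_model(prompt)  # hypothetical call
# print("follows context" if truth.lower() in answer.lower() else "leans toward trained-in bias?")
```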
I can't prove it, but my experience with commercial models is that baked-in bias is strong. There have been times where I state X=1 over and over again in context but still get X=2, or some other value, back. Sometimes the wrong value comes back every time; sometimes it's something different every time.
You can see this with some coding agents: they are not good at ingesting code and reproducing it as they saw it, but they can readily reply with what they were trained on. For example, I was configuring a piece of software that had a YAML config file. The agent kept trying to change the values of unrelated keys back to the default example values from the docs whenever it made a change somewhere else. It's a highly forked project, so I imagine both the docs and the example config files appear in its training set thousands, if not millions, of times, if it wasn't deduped.
If you don't give an agent access to sed/grep/etc., the model will eventually fuck up what's in its context. That might not be the result of bias every time, but when the fucked-up result maps to a small set of values, it kind of seems like bias to me.
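A cheap guard against that YAML failure mode is to parse the file before and after the agent's edit and reject any change outside the keys you asked it to touch. A sketch using PyYAML; the file names and the allowed key are invented for the example:

```python
# Sketch: check that an agent-edited YAML file only changed the intended key.
# File names and the allowed key are placeholders for illustration.
import yaml

with open("config.before.yaml") as f:
    before = yaml.safe_load(f)
with open("config.after.yaml") as f:
    after = yaml.safe_load(f)

ALLOWED = {"server.port"}  # the one key the agent was asked to change

def flatten(d, prefix=""):
    """Flatten nested dicts into dotted-path keys for easy diffing."""
    out = {}
    for key, value in d.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten(value, path + "."))
        else:
            out[path] = value
    return out

fb, fa = flatten(before), flatten(after)
changed = {k for k in set(fb) | set(fa) if fb.get(k) != fa.get(k)}
unexpected = changed - ALLOWED
if unexpected:
    print("Agent touched keys it shouldn't have:", sorted(unexpected))
```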
To answer your question, my gut says that if you dumped a CSV of that data into context, the model isn't going to perform actual statistics; it will regurgitate something closer to the space of your question than to the space of a bunch of rows of raw data. Your question is going to be in the training data a lot - explicitly - there will be articles about it, research, etc., all in English, using your own terms.
I also think by definition LLMs have to be biased towards their training data, like that's why they work. We train them until they're biased in the way we like.
> I'm curious how much trained-in bias damages in-context performance.
I think there's an example right in front of our faces: look at how terribly SOTA LLMs perform on underrepresented languages and frameworks. I have an old side project written in pre-SvelteKit Svelte. I needed to do a dumb little update, so I told Claude to do it. It wrote its code in React, despite all the surrounding code being Svelte. There's a tangible bias towards things with larger sample sizes in the training corpus. It stands to reason those biases could appear in more subtle ways, too.
Coreference resolution tests something like this. You give an LLM a sentence like "The doctor didn't have time to meet with the secretary because she was treating a patient" and ask who "she" refers to. Reasoning tells you it's the doctor, but statistical pattern matching makes it the secretary, so you can check how the model is reasoning and whether correlations ("bias") trump logic.
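A minimal version of that probe is just a pair of prompts where the verb forces the correct reading one way or the other; query_model() below is a hypothetical wrapper, and the sentences are illustrative, in the style of WinoBias-type benchmarks:

```python
# Sketch of a coreference-bias probe: the verb forces the correct referent, so a
# model that still answers with the stereotype is pattern matching, not reasoning.
# query_model() is a hypothetical placeholder for your LLM call.
pairs = [
    ("The doctor didn't have time to meet with the secretary because she was "
     "treating a patient. Who does 'she' refer to?", "doctor"),
    ("The secretary didn't have time to meet with the doctor because she was "
     "typing up the minutes. Who does 'she' refer to?", "secretary"),
]

# for sentence, expected in pairs:
#     answer = query_model(sentence)  # hypothetical call
#     verdict = "ok" if expected in answer.lower() else "stereotype wins?"
#     print(f"{expected:>9} -> {answer!r} ({verdict})")
```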
The question of bias reduces to bias in factual answers and bias in suggestions - both of which come from the same training data. Maybe they shouldn't.
If the model is trained on data that shows, e.g., that Black people earn less, then it can factually report on this. But it may also suggest that this should be the case when given an HR role. Every solution I can think of is fraught with another disadvantage.
If bias can only be seen by a minority of people ... is it really 'AI bias', or just societal bias?
> “In one of the experiment scenarios — which featured racially biased AI performance — the system failed to accurately classify the facial expression of the images from minority groups,”
Could it be that real people have trouble reading the facial expression of the image of minority groups?
By "real people" do you mean people who are not members of those minority groups?
Or are people who can "accurately classify the facial expression of images from minority groups" not "real people"?
I hope you can see the problem with your very lazy argument.
I guess I'm not sure what the point of the dichotomy is. Suppose you're developing a system to identify how fast a vehicle is moving, and you discover that it systematically overestimates the velocity of anything painted red. Regardless of whether you call that problem "AI bias" or "societal bias" or some other phrase that doesn't include the word "bias", isn't it something you want to fix?
Not the OP, but to me personally: yes. Facial structure, lips, eyes... the configuration tilts towards an expression that I interpret differently. A friend of mine is Asian; I've learned to be better at it, but at first he looked to me like he had a flatter affect than average. People of color look more naive than average to me, across the board, probably due to their facial features. I perceive them as having less tension in the face, I think (which is interesting now that I think about it).
I have a background in East Asian cultural studies. A lot more expressions are done via the eyes there rather than the mouth. For the uninitiated, it's subtle, but once you get used to it, it becomes more obvious.
Anthropologists call that display rules and encoding differences. Cultures don't just express emotion differently; they also read it differently. A Japanese smile can be social camouflage, while an American smile signals approachability. I guess that's why Western animation over-emphasizes the mouth, while Eastern animation tends to over-emphasize the eyes.
Why would Yakutian, Indio, or Namib populations not have similar phenomena that an AI (or a stereotypical white Westerner who has not extensively studied those societies/cultures) would not immediately recognise?
AI trained on Western facial databases inherits those perceptual shortcuts. It "learns" to detect happiness by wide mouths and visible teeth, and sadness by drooping lips - so anything outside that grammar registers as neutral or gets misclassified.
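That kind of shortcut is easy to miss in a single headline accuracy number and easy to see once you disaggregate by group. A small sketch with invented labels, not data from the study:

```python
# Sketch: report accuracy per group instead of one global number, so a shortcut
# that only works for the majority group can't hide. Labels are invented.
import numpy as np

groups = np.array(["white", "white", "white", "black",   "black", "black"])
y_true = np.array(["happy", "sad",   "happy", "happy",   "sad",   "happy"])
y_pred = np.array(["happy", "sad",   "happy", "neutral", "sad",   "neutral"])

print("overall accuracy:", (y_true == y_pred).mean())
for g in np.unique(groups):
    mask = groups == g
    print(f"{g} accuracy:", (y_true[mask] == y_pred[mask]).mean())
# The global number looks passable while one group's accuracy collapses.
```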
And it gets reinforced by (Western) users: a hypothetical 'perfect' face-emotion-identification AI would probably be perceived as less reliable by the white Western user than one that mirrors the biases.
Except for one instance where "black" is all lowercase, the article capitalizes the first letter of the word "black" every time, and "white" is never capitalized. I wonder why. I'm not trying to make some point either; I genuinely am wondering why.
It's a modern style of a lot of publications that want to appear progressive or fear appearing insufficiently progressive.
Black people (specifically this means people in the US who have dark skin and whose ancestry is in the US) have a unique identity based on a shared history that should be dignified in the same way we would write about Irish or Jewish people or culture.
There is no White culture, however, and anyone arguing for an identity based on something so superficial as skin colour is probably a segregationist or a White supremacist. American people who happen to have white skin and are looking for an identity group should choose to identify as Irish or Armenian or whatever their ancestry justifies, or they should choose to be baseball fans or LGBTQ allies or some other race-blind identity.
You're arguing that "Black" is an identity in the US because the people thus identified share a common history within the US, even though their ancestors originated from different regions and cultures before they were enslaved and shipped to North America. Yet in the next paragraph you argue that "White" is not a valid identity, because their ancestors originated from different regions and cultures, even though they share a common history within the US. How do you reconcile this double standard?
Edit: In case you're only paraphrasing a point of view which you don't hold yourself, it would probably be a good idea to use a style that clearly signals this.
>Black people (specifically this means people in the US who have dark skin and whose ancestry is in the US)
So dark-skinned Africans aren't "Black"? (But they are "black"?)
Why not just use black/white for skin tone, and African-American for "people in the US who have dark skin and whose ancestry is in the US"? Then for African immigrants, we can reference the specific nation of origin, e.g. "Ghanaian-American".
I don't know if I agree in this instance. While I agree that Black people absolutely have a shared identity and culture, the article is clearly talking about skin colour and comparing how AI represents two skin tones, so I would assume that by your definition it should use lowercase in both cases.
If it's comparing a culture vs a 'non-culture' then that doesn't sound like for like.
That's a very American-centric viewpoint. The rest of the world also has a lot of different cultures of black people, and relative to the rest of the world the US 'white' culture is extremely distinctive, no matter that the members themselves quibble about having 1/16th Irish ancestry or whatever it is.
Yup, it appears as neutral bias because (or rather, when) it corresponds 1:1 with your belief system, which by default is skewed af. Unless you've done a rigorous self-inquiry, mapped your beliefs, and are thoroughly aware of them, that's going to be true nearly always.
Bias means different things, though. If most people are cautious but the LLM is carefree, then that is a bias. Or if it recommends planting sorghum over wheat, that is a different bias.
In addition bias is not intrinsically bad. It might have a bias toward safety. That's a good thing. If it has a bias against committing crime, that is also good. Or a bias against gambling.
> And personally, I think when people see content they agree with, they think it's unbiased. And the converse is also true.
> So conservatives might think Fox News is "balanced" and liberals might think it's "far-right"
The article is talking about cases where the vector for race accidentally aligns with emotion, so the model can classify a happy black person as unhappy just because the training dataset has lots of happy white people. It's not about subjective preference.
People could of course see a photo of a happy black person among 1000 photos of unhappy black people and say that person looks happy, and realize the LLM is wrong, because people's brains are pre-wired to perceive emotions from facial expressions. LLMs will pick up on any correlation in the training data and use that to make associations.
But in general, excepting ridiculous examples like that, if an LLM says something that a person agrees with, I think people will be inclined to (A) believe it and (B) not see any bias.
Your comment has made me wonder what fun could be had in deliberately educating an LLM badly, so that it is Fox News on steroids with added flat-earth conspiracy nonsense.
For tech, only Stack Overflow answers modded negatively would 'help'. As for medicine, a Victorian encyclopedia from the days before germs were discovered could 'help', with phrenology, ether, and everything else now discredited.
If the LLM replied as if it was Charles Dickens with no knowledge of the 20th century (or the 21st), that would be pretty much perfect.
I love the idea! We could have a leaderboard of most-wrong LLMs
Perhaps LoRA could be used to do this for certain subjects like JavaScript? I'm struggling to come up with more sources of lots of bad information for everything, however. One issue is the volume, maybe? Does it need lots of input about a wide range of stuff?
Would feeding it bad JS also twist code outputs for C++?
Would priming it with flat-earth understandings of the world make outputs about botany and economics also align with that world view, even if only non-conspiracists had written on those subjects?
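If someone did want to try the badly-educated-model experiment, a LoRA adapter is cheap enough to make it feasible. A minimal sketch with Hugging Face peft, where the model name, target modules, and the "bad" corpus are all placeholders, not a recipe:

```python
# Sketch: attach a LoRA adapter to a small causal LM and fine-tune it on a
# deliberately "bad" corpus. Model name, target modules, and data are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "gpt2"  # placeholder; any causal LM works in principle
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora = LoraConfig(
    r=8,                        # low-rank dimension; small is fine for a toy run
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's attention projection; varies by model
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# From here it's an ordinary causal-LM training loop over the curated "wrong
# answers only" corpus. Only the adapter weights are updated, so the damage is
# confined to a few megabytes that can be unloaded again.
```

Whether bad JS would bleed into C++ output is then something you could measure directly by evaluating the adapted model on held-out C++ prompts.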
I have yet to meet a single regular joe, conservative or not, who will honestly make the blanket statement that Fox News is unbiased.
Even googling, I cannot find a single person claiming that. Not one YT comment. All I can find is liberal outlets/commentators claiming that conservatives believe Fox News is unbiased. There are probably some joes out there holding that belief, but they're clearly not common.
The whole thing is just another roundabout way to imply that those who disagree with one's POV lack critical thinking skills.
"Most users" should have a long, hard thought about this, in the context of AI or not.
Except that requires “a level of critical thinking and self-awareness…”
https://uclanlp.github.io/corefBias/overview
> I hope you can see the problem with your very lazy argument.
It's not about which people per se, but how many, in aggregate.
Lmao, oh the irony reading this on HN in 2025
The cognitive dissonance is really incredible to witness
And personally, I think when people see content they agree with, they think it's unbiased. And the converse is also true.
So conservatives might think Fox News is "balanced" and liberals might think it's "far-right"
Yeah, confirmation bias is a hell of a thing. We're all prone to it, even if we try really hard to avoid it.
explain how "agreeing" is related
One only has to see how angry conservatives/musk supporters get at Grok on a regular basis.