(Note the thread displays differently now because Twitter have changed their cropping algorithm)
Originally @colinmadland was trying to post examples of how Zoom's virtual background had removed his black colleague's head. However, when he posted the side-by-side images (with heads) on Twitter, Twitter always cropped out his colleague and showed only him, even if he horizontally swapped the image. So, while trying to talk about an apparently racist algorithm in Zoom, he was scuppered by an apparently racist algorithm in Twitter.
The web version shows Mitch, but the app shows a blank white area (which is at the center of the image, meaning it didn't try to crop to one of the faces). I'm on iOS.
That example is from last September, so it doesn't say anything about whether this has improved. They probably generate the crop once, when the tweet is posted.
The Zoom example is a racist algorithm. It was built with a data set that led it to produce different results for different skin colours.
The Twitter example was not a racist algorithm. It would consistently pick one head over the other, but that had nothing to do with skin colour. It might prefer the black head for some pairs, and the white head for other pairs.
In the second example people anthropomorphised the algorithm. They assumed that any example of a preference for one image was due to racial bias, and it was easy to keep feeding it images until they found an input that confirmed this assumption.
I find that calling it a `racist algorithm` doesn't really help unless the behaviour was intentional. This is a case of poor training data, the same as Google's image classification messing up its tags.
Plenty of racism in humans isn't malicious, either, but is just a byproduct of bad training data. The outcome is bad regardless of what was intended, and it's the outcome that matters.
Let's say your company decides to use AI to assist in hiring, and it turns out the algorithm used is biased when it comes to candidates' race. If there is a disparate impact[1] on protected classes in hiring that's unrelated to job performance, intentions don't matter in the eyes of labor law; what matters are the effects.
[1] https://en.wikipedia.org/wiki/Disparate_impact
"OK, we need pictures of human faces; luckily I've got all these white people here!"
On edit: it was racist in result, in that it empowers a racist system, but it was not racist in intention, as in the people gathering the training data probably didn't say "hey, how can we empower a racist system with this?"
Racism as a concept has evolved in meaning. It used to only include the most severe intentional cases of bigoted behavior, whereas now it also includes less obvious biases that lead to preventable but not necessarily intentional instances of everyday prejudice and bigotry.
I for one am happy we have unneutered the word from having to clear a bar so high that it wouldn't apply to most bigotry. But it is also unfortunate for people who have not caught on and believe calling a thing racist is a damning statement of evil intent, because it really is not anymore. Or for those who insist on the meaning of words remaining static forever.
So, I can choose to see only un-cropped images on my TL, and the author can see a preview of the algorithm's crop before they tweet -- but a glaring omission is simply exposing a crop tool to the author. The model works by choosing a point on which to center the crop. Why can't you give users a UI to do the same? "Tap a focal point in the image, or let our robot decide!"
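To make the "tap a focal point" idea concrete, here's a minimal sketch of centering a fixed-size crop on a user-chosen point and clamping it to the image bounds; the function name and parameters are illustrative assumptions, not Twitter's actual API.

```python
def crop_around_focal_point(img_w, img_h, crop_w, crop_h, fx, fy):
    """Top-left corner of a crop_w x crop_h window centered as close as
    possible to the tapped focal point (fx, fy), clamped to the image."""
    left = min(max(fx - crop_w // 2, 0), img_w - crop_w)
    top = min(max(fy - crop_h // 2, 0), img_h - crop_h)
    return left, top

# e.g. a 1200x900 photo, a 600x335 preview crop, focal point near the right edge
print(crop_around_focal_point(1200, 900, 600, 335, 1100, 200))  # (600, 33)
```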
The blog post mentions several times how ML might not be the right choice for cropping; but their conclusion was...to keep using ML for cropping. I hope someone got a nice bonus for building the model!
> but their conclusion was...to keep using ML for cropping
My takeaway from the article was that their conclusion was to remove cropping from the product, starting incrementally on iOS. (Cropping was removed for me on Android recently as well.) That seems like the opposite of "keep using ML for cropping"?
I can't really see any downside, besides maybe a little bit of developer time, to allowing users to see a preview of the crop and optionally override it. It's done all the time in other places.
It's probably a bit harder at Twitter's unique scale. They have an incredibly high throughput of new posts, and a large portion of those posts include one to four images that need cropping.
Image cropping algorithms are hard. When we made our first one for reddit, it used this algorithm:
Find the larger dimension of the image. Remove either the first or last row/column of pixels, based on which had less entropy. Keep repeating until the image was a square.
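For illustration, here's a minimal sketch of that trimming loop in Python; the grayscale input, histogram-based entropy, and function names are my assumptions rather than reddit's actual implementation.

```python
import numpy as np

def strip_entropy(strip):
    # Shannon entropy of one row/column of 8-bit grayscale pixel values
    counts = np.bincount(strip.astype(np.uint8).ravel(), minlength=256)
    probs = counts[counts > 0] / counts.sum()
    return float(-(probs * np.log2(probs)).sum())

def entropy_crop_to_square(img):
    """Repeatedly trim the lower-entropy edge along the longer axis until square."""
    img = np.asarray(img)
    while img.shape[0] != img.shape[1]:
        if img.shape[0] > img.shape[1]:            # too tall: compare top vs bottom row
            if strip_entropy(img[0, :]) <= strip_entropy(img[-1, :]):
                img = img[1:, :]                   # drop the top row
            else:
                img = img[:-1, :]                  # drop the bottom row
        else:                                      # too wide: compare left vs right column
            if strip_entropy(img[:, 0]) <= strip_entropy(img[:, -1]):
                img = img[:, 1:]                   # drop the left column
            else:
                img = img[:, :-1]                  # drop the right column
    return img
```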
The most notable "bias" of this algorithm was the male gaze problem identified in the article. Women's breasts tended to have more entropy than their face, so the algorithm focused on that since it was optimized for entropy. To solve the problem, we added software that allowed the user to choose their thumbnail, but not a lot of users used it or even realized they could.
I assume they've since upgraded it to use more AI with actual face detection and so on, but at the time, doing face detection on every image was computationally infeasible.
Breasts shouldn't have more entropy than the face. Perhaps the reason is that the breasts are in the middle of the picture, so the face ends up being compared against the bottom rows more frequently?
Why not? Shirts might have flashy patterns, differently colored fabrics, alternating skin and shirt. On a row-by-row basis I can see the chest area being more entropic than a face with an even skin tone.
edit: I googled "woman" and selected random pictures which showed the whole upper body, entropy summed over each row to the right: https://imgur.com/a/oVB57gu
And even just trying to access a post/thread on mobile: I already clicked the link, then I have to click once more to say "yes, I want to do the browser thing I explicitly chose to do", then another time still to actually show more than half a screen of content.
I don't think the claim is that the behaviour is caused by "male gaze", but rather that the outcome of always focusing the cropping around any visible cleavage is functionally identical.
Whether or not it's unsupervised, whether or not it's sexist, it seems that a thumbnail focusing on a person's face rather than their breasts is typically going to be more desirable. Depending on context, of course.
"We began testing a new way to display standard aspect ratio photos... without the saliency algorithm crop. The goal of this was to give people more control over how their images appear while also improving the experience of people seeing the images in their timeline. After getting positive feedback on this experience, we launched this feature to everyone."
So the solution all along was to give users the ability to crop their own photos. Why wasn't this the original way of doing things?
Instead of forcing a complicated algorithm into the Twitter experience, it seems to me that the solution all along was just to let users do what they do best-- make tweets for themselves. This incident strikes me as a major failing of AI: We are so eager to shoehorn AI/ML into our products that we lose sight of what actually makes users happy.
What’s really remarkable is that giving users the ability to manually crop would be an amazing way to gather data on optimal cropping, which they could have used to train their model down the road. I can only imagine how much more time and money went into gathering eye tracking data.
If you were trying to build real bias into your cropping algorithm, I would suspect that training it on what the average, unconsciously biased user thinks is the best crop nearly guarantees it.
> Why wasn't this the original way of doing things?
Someone wanted to do a feature so they could get promoted. Probably with some mumbo jumbo about how it reduces the number of clicks to create a tweet and thus increases revenue.
> One of our conclusions is that not everything on Twitter is a good candidate for an algorithm, and in this case, how to crop an image is a decision best made by people.
This seems like it should have been a foregone conclusion. What was the driving force in the first place to think cropping images with an AI model was desirable? Seems like ML was a solution looking for a problem here, and I'm glad they've realised that.
Right but... we've been cropping images in web applications since... y'know, pretty much ever. Using ML to do this was always pretty ridiculous overkill. Give the users an image cropper, and be done with it.
All those examples show a large improvement. Of course, they might have cherrypicked images with large improvements for a blog post advertising the feature. But still, it illustrates why people would think it's a good idea.
Of course they don't seem to consider the idea of not cropping at all.
I'm more forgiving about corporate jargon than most. A lot of it really does help optimize communication for the situations you encounter in corporate work.
But "learnings" is literally, exactly, just a synonym for "lessons." Can we not?
I disagree: "Sharing lessons..." would mean "here is an educational resource that we have created, as teachers, for an audience of students". I think "Lessons learned..." is closer to what you mean to suggest, and "Learnings..." is more concise (this is from Twitter, after all).
It's a neologism, or possibly the resurrection of a long unused form - I don't know exactly how it came about, but I agree completely that one of the meanings of "lesson" is "a thing which has been learned".
In my experience, there's a tendency toward folksiness in certain varieties of corpspeak that causes rejection of "formal-sounding" terms and repurposing of "plainer" forms to create new words, hence lessons = learnings, protégé = mentee, and so on.
/rant but I feel like talking about percentage points of difference is always hard for humans. For example:
> In comparisons of men and women, there was an 8% difference from demographic parity in favor of women.
would have been clearer (and more correct) as "an 8 percentage-point difference from demographic parity". That 8 pp difference though is a 16% "relative" difference (58/50), or more starkly "The algorithm chose the woman almost 40% more often" (58/42 => 1.38). That said, the diagram in the post [1] is much easier for humans to parse and say "wow, that looks pretty far off!".
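To spell out the arithmetic behind those three framings (a quick sketch assuming the 50/50 parity baseline and the 58/42 split quoted above):

```python
parity = 0.50
women = 0.58               # 8 percentage points above parity
men = 1.0 - women          # 0.42

pp_diff = (women - parity) * 100             # percentage-point difference from parity
relative_diff = (women / parity - 1) * 100   # "relative" difference vs parity
more_often = (women / men - 1) * 100         # how much more often the woman was chosen
print(round(pp_diff, 1), round(relative_diff, 1), round(more_often, 1))  # 8.0 16.0 38.1
```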
tl;dr: A number like 8% sounds like "no big deal", but 8 percentage points (on each side) is a big deal!
[1] https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter...
> In comparisons of black and white individuals, there was a 4% difference from demographic parity in favor of white individuals.
It's hard to believe that the bias was only 4% - there were a lot of people testing with images that they sourced themselves, and the preference for white people seemed much closer to 80-20.
The paper authors mention that their training data is from Wikidata (pictures of celebrities). I wonder if the types of photos in that dataset are meaningfully representative of the kinds of photos that people usually post to Twitter.
> It's hard to believe that the bias was only 4% - there were a lot of people testing with images that they sourced themselves, and the preference for white people seemed much closer to 80-20.
It's very easy to believe the bias was near zero, given that you are citing highly motivated people on Twitter cherrypicking from thousands of examples, and it's a little baffling that you find that more credible than controlled, systematic experimentation. Note, for example, the extremely striking fact that the fuss completely missed the other bias they found, which was several times larger; that shows how totally useless people on social media are for testing these things, and how they can conjure up "80-20" biases which don't exist.
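For contrast, a controlled check looks something like the sketch below: sample image pairs systematically, record which face the crop keeps, and compare the selection rate against parity with an error bar. This is a rough illustration under my own assumptions (a 50% parity baseline and a normal-approximation confidence interval), not the methodology from Twitter's paper.

```python
import math

def parity_gap(picks):
    """picks: 1 if the white face was kept by the crop, 0 if the black face was,
    over systematically sampled (not cherrypicked) image pairs."""
    n = len(picks)
    rate = sum(picks) / n
    gap = rate - 0.5                                       # e.g. 0.54 -> 4 pp from parity
    half_width = 1.96 * math.sqrt(rate * (1 - rate) / n)   # ~95% CI half-width
    return gap, half_width

# A handful of viral examples tells you very little:
print(parity_gap([1] * 10 + [0] * 2))     # gap ~0.33, but the CI half-width is ~0.21
# Thousands of sampled pairs shrink the interval enough to resolve a 4 pp gap:
print(parity_gap([1] * 540 + [0] * 460))  # gap 0.04, CI half-width ~0.03
```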
> highly motivated people on Twitter cherrypicking
One of the referenced threads that identified the issue happened upon it by accident, while highlighting a surprising experience in another product (Zoom). Believe it or not, people who care about this stuff are not looking for things to complain about; we're tired and overwhelmed. And I would hope that people who, upon discovering a vulnerability, find and catalogue the ways it can be exploited would be celebrated here.
I admit that it wasn't especially rigorous testing, but I personally tested this along with other people I knew. I used real photos from my camera and my wife's (we are of different races), featuring photos of ourselves, friends, and family.
I of course hope that the systems I use aren't racist against my loved ones. I am motivated to confirm whether or not they are, but I didn't go on to parlay my findings into an essay for clout. I gained nothing from doing this, except the knowledge that Twitter was suckier than I knew.
https://twitter.com/colinmadland/status/1307111816250748933
It was widely covered in the press at the time https://www.theguardian.com/technology/2020/sep/21/twitter-a...
Cute puppy nose -> click -> porn ad.
Someone wrote and tested this algorithm, and either:
a) didn't test it on pictures of women, or,
b) didn't notice that it cropped breasts rather than faces, or,
c) didn't think that was a problem.
If they had noticed and cared, this wouldn't be the approach in use.
Clearly there is human-derived input in the system (otherwise what's the point? Just crop randomly).
Aha, perhaps that's the problem then.
Twitter crops photos to fit their preview formats. It seems like an obvious improvement to show people's faces when cropping, etc.
https://blog.twitter.com/engineering/en_us/topics/infrastruc...