Lyft’s algorithm is trying to block people with names like ‘Dick’ and ‘Cummings’

Besides the humor of how Lyft's filter flagged traditionally Caucasian names like "Cummings" – in addition to the usual issues with non-Western names, e.g. Pimpong and Poon – I'm fascinated/confused by how this made it into production. The user database already existed and was being used by the live application. Before deploying this new filter onto the production database, wouldn't you do a dry run to get not only the count of users who will be flagged and notified, but a listing of frequently flagged names? Which you could easily manually eyeball to make sure there weren't obvious false positives?

With a userbase as big as Lyft's, I'm sure there were a ton of obvious true positives (anyone named "Fuck", ostensibly). I just can't believe they didn't notice a surname as relatively common as Cumming/Cummings. Yes, the apparent naivety of the regex is an issue, but this seems like a system for which a lot of testing on actual data would be easy and natural to do as part of the QA process.
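
For what it's worth, the kind of dry run I mean could be as small as the sketch below. The blocklist, the substring rule, and the sample names are all made up for illustration; I have no idea what Lyft's actual filter looks like.

```python
import re
from collections import Counter

# Hypothetical blocklist and matching rule; the real filter is not public.
BLOCKLIST = ["cum", "dick", "fuck"]
PATTERN = re.compile("|".join(map(re.escape, BLOCKLIST)), re.IGNORECASE)


def dry_run(names):
    """Report what the filter would flag, without changing any account.

    Returns (number of flagged users, Counter of flagged names).
    """
    flagged = Counter(name for name in names if PATTERN.search(name))
    return sum(flagged.values()), flagged


if __name__ == "__main__":
    # Stand-in for a read-only export of names from the user table.
    sample = ["Cummings", "Cummings", "Dick", "Alice", "Fuck", "Cumming"]
    total, by_name = dry_run(sample)
    print("users flagged:", total)
    # The most frequently flagged names are the ones worth eyeballing first.
    for name, count in by_name.most_common(10):
        print(name, count)
```

Sorting by frequency is the point: a common surname sitting at the top of that list is hard to miss.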

spacechild1 · 6 years ago

> traditionally Caucasian names like "Cummings"

It always amazes me when US Americans talk in terms of human races like it's the most normal thing in the world. And no, “Cummings” is not a “Caucasian name” (whatever that is supposed to mean), it's an English name. I don't want to sound too harsh, it's just that as a Central European with all our recent history, things like this really rub me the wrong way.

asveikau · 6 years ago

As an American I agree and this was also my reaction to the comment.

There are a number of these things, though. I personally get a little irritated with the way many people here use terms like "Hispanic" to contrast with "white people". To me, "Hispanic" is a wide spectrum of ethnicities but really more of a linguistic, cultural and national/political specifier, with lots of overlap with "white". People who are trying to be correct about this will pull out complicated phrases like "non-hispanic whites" which may be correct but are also clumsy, reflecting the absurdity of the way we categorize this country's social boundaries in the first place. "White" itself has had a fluid definition in the US over time, previously having excluded southern or eastern Europe, Jews, or even the Irish.

I know that in my own recent ancestry I have at least 4 European ethnicities, some of which would have been ridiculed by other Americans in the past. But this nuance washes away and I am a "white guy" which lumps me together with cultures I don't particularly identify with. There is no sense in denying, for example, that I am less discriminated against by appearance than other groups. But I do not think either that there is one homogeneous "white people" living in the United States.

coldcode · 6 years ago

Caucasian is a funny word to apply to Europeans, since the Caucasus is the area around present-day Georgia and is home to a highly diverse ethnic mix of people whom most people would classify as Central Asian (like Turks and Armenians, for instance). Not exactly what you would think of as "white", which is what many Americans assume it means.

noelsusman · 6 years ago

Race is not a uniquely American concept. The Germans were the ones who first came up with the idea of a Caucasian race back in the 18th century. You're not wrong though, we should use White European instead since that's what it means today.

userbinator · 6 years ago

> positives (anyone named "Fuck", ostensibly).

This guy would disagree:

http://mw.eco.br/ig/prof/ReinhardtAdolfoFuck.htm

Along the same lines,

https://www.nie.edu.sg/profile/chew-shit-fun

danso · 6 years ago

Yeah, no name filter is going to be perfect for all edge cases. But presumably, it's a lot cheaper to have the Lyft support team handle email tickets from the relatively few Professor Fucks, than to have them also deal with every Cummings and Dick.

spacechild1 · 6 years ago

In Austria, we have many people called “Fick“, which literally means “Fuck“.

BTW, isn't it insulting to those people to suggest that their names are insulting?

jliptzin · 6 years ago

That second link actually made me spit out coffee for the first time in my life.

qwerty456127 · 6 years ago

This is all bullshit anyway. A word is just a word and whenever it insults somebody it's their own problem. I would rather change my actual name to Fuck if I wasn't too lazy to deal with problems like that.

I once read a story of a Vietnamese man named Hui. He had to go to the court to protect his right to be named this way in Ukraine where the word is used widely and has only one meaning - a dick.

reaperducer · 6 years ago

> A word is just a word and whenever it insults somebody it's their own problem.

There was a study linked to on HN a couple of years ago that found that babies recognize swear words even before they have language skills, and that people can sometimes recognize swear words in languages they don't speak. It has to do with the tone and sharpness of the sounds.

gridlockd · 6 years ago

> A word is just a word and whenever it insults somebody it's their own problem.

That's what you think the world should be like, not the way it is. If your name is Adolfhitler Smith, a perfectly legal name in the US, then you will have trouble getting hired.

That's a made up example, but there's evidence[1] that unusual (not necessarily obscene) names will lead to less success in job applications.

If you have a "bad" name, it's your problem well before anyone else's.

[1] https://www.nber.org/digest/sep03/w9873.html

cortesoft · 6 years ago

This is a nonsensical argument. A word has meaning, which is why it is a word and not a sound.

If I yell insults at you, you are saying that doesn't mean anything? And if you get insulted, it is your own fault?

No, words have meaning based on a common understanding between the person saying the word and the person hearing the word. If I insult you and you understand it, it is me who did the insulting, it is not 'your problem'

You should try this argument out in court: "No, your honor... I can't be guilty of threatening to kill someone.. sure I said, 'I am going to kill you tomorrow at 5pm', but those were just words! If the person felt threatened, that is on them!"

gdy · 6 years ago

"A word is just a word"

Good luck using the N-word in the US.

reaperducer · 6 years ago

Since a credit card is required to use Lyft, why not just validate the name on the card with the card company and make it the issuer's problem?

danso · 6 years ago

I just tried to create a driver's account on Lyft's website. There's an initial page where you give them your name. And then, deeper into the process, a form for supplying your name as it officially appears on your driver's license.

Which makes sense. There's likely a lot of drivers who want to go by "Tom" instead of "Thomas" or whatever their full official name is – which is even more important for drivers with very uncommon names who want to go by something more familiar/pronounceable. There's probably a decent safety argument for not requiring drivers to have their full real names be accessible to the user, especially in cases where a driver has an uncommon (i.e. easy to doxx) name.

matthewarkin · 6 years ago

None of the major card networks except for American Express validate the cardholder name. The Address Verification Service (AVS) also only validates numeric values (so if your address is 123 Main Street, 123 Maple Ave will return a matching response).
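
To make the numeric-only point concrete, here is a toy model of that behavior. It is only an illustration of the comparison described above, not the card networks' actual AVS logic, and both function names are made up.

```python
import re


def numeric_part(address):
    """Keep only the digits of an address line (hypothetical helper)."""
    return "".join(re.findall(r"\d", address))


def numeric_only_match(on_file, supplied):
    """Toy comparison that ignores everything except the digits."""
    return numeric_part(on_file) == numeric_part(supplied)


if __name__ == "__main__":
    # Different streets, same house number: the digits still line up.
    print(numeric_only_match("123 Main Street", "123 Maple Ave"))
    # Same street, different house number: the digits differ.
    print(numeric_only_match("123 Main Street", "124 Main Street"))
```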

andrewbarba · 6 years ago

A credit card is not required to use Lyft. You can use credit from gift cards or promo codes, as well as one-time payment methods like Apple Pay, which are completed at the time of ride booking and do not expose names.

joshmn · 6 years ago

You can't validate the name on a credit card. There is no error code that indicates an invalid name.

dilap · 6 years ago

Why even try to catch “naughty” names. Why do I even need to give Lyft a name in the first place? It just seems like such a made-up problem.

jrockway · 6 years ago

I think you and the driver are supposed to exchange names. This is probably awkward when you name yourself "suck my balls" or something, which people undoubtedly do because people are terrible.

The Simpsons figured this out decades ago: https://www.youtube.com/watch?v=M1EjcWU6sEk

jdavis703 · 6 years ago

I’m guessing test and prod are highly-segregated to prevent improper data access. Remember when one of the app-based taxi companies had employees who spied on their exes via their system? I’m guessing they beefed up security to mitigate the insider threat risk.

Regardless, your point still stands. The product manager should have gone through whatever process was needed to pull down a limited set of production data to test against. Or the engineers could have logged exceptions in prod before making the filter active.
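
A rough sketch of that second option, assuming a simple regex filter: run the check in prod but only log what it would have flagged, and flip enforcement on after someone has read the logs. The pattern, the logger name, and the `ENFORCE` flag are all hypothetical.

```python
import logging
import re

logger = logging.getLogger("name_filter")

# Hypothetical rule; the real filter is not public.
PATTERN = re.compile(r"cum|dick", re.IGNORECASE)

# Log-only ("shadow") mode until the logged matches have been reviewed.
ENFORCE = False


def check_name(name, enforce=ENFORCE):
    """Return True if the name is allowed through.

    In log-only mode a match is recorded but the name is still accepted.
    """
    if PATTERN.search(name):
        logger.warning("name filter would flag: %r", name)
        return not enforce
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    print(check_name("Cummings"))                # logged, still allowed
    print(check_name("Cummings", enforce=True))  # logged and rejected
```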

bequestry · 6 years ago

That end of year rush to finish projects in order to get them on your yearly review... :rolls eyes: Lots of corner cutting happens in December.

jfoster · 6 years ago

What's the upside of deploying this anyway? It's not as though it's similar to Facebook where these names are very publicly visible. It's just the drivers who would see the names.

They could just generate the list of questionable names and get support to give them a call about it or institute a rider identity verification programme.

dannyw · 6 years ago

Wouldn’t be surprised if this was a 20% project by an employee group of some sort.

The intentions can be noble, such as a Lyft employee seeing a user named “F* N*” and thinking, hey that shouldn’t be okay.

The actual consequences can be meh.

clSTophEjUdRanu · 6 years ago

I had a friend named Hung Wang who was pissed Microsoft wouldn't let him make that his gamertag.

"I'm so sorry my name is offensive!"

somebodythere · 6 years ago

I have a friend named Aryan who was not allowed to sign up for a Microsoft account with that name. Support even refused to make the exception when asked, suggesting that he sign up with the name "Ryan" instead.

To this day, emails from that account have "Ryan <lastname>" in the header.

jorblumesea · 6 years ago

> Which you could easily manually eyeball to make sure there weren't obvious false positives?

Implying devs test their code thoroughly. lol. After working at numerous companies (some with high bars of entry) I've noticed many devs push changes without testing properly. Or, if they test, they rarely consider edge cases or implications of what they're doing and only test the obvious base cases.

tiemand · 6 years ago

In my team we called it the "Scunthorpe problem". Scunthorpe is a town in northern England.
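
For anyone who hasn't run into it, the problem is that a plain substring match flags innocent words that merely contain a blocked string. A minimal sketch, using a made-up blocklist built from the names in the headline:

```python
import re

# Hypothetical blocklist, for illustration only.
BLOCKED = ["cum", "dick"]
ALTERNATION = "|".join(map(re.escape, BLOCKED))

# Naive rule: a blocked string anywhere inside the name.
SUBSTRING = re.compile(ALTERNATION, re.IGNORECASE)
# Stricter rule: a blocked string only as a whole word.
WHOLE_WORD = re.compile(r"\b(?:" + ALTERNATION + r")\b", re.IGNORECASE)


def flagged_by_substring(name):
    return bool(SUBSTRING.search(name))


def flagged_by_whole_word(name):
    return bool(WHOLE_WORD.search(name))


if __name__ == "__main__":
    for name in ["Cummings", "Dickinson", "Dick Cummings", "Alice"]:
        print(name, flagged_by_substring(name), flagged_by_whole_word(name))
```

Whole-word matching clears "Cummings" and "Dickinson", but someone whose first name really is "Dick" still gets flagged, so word boundaries only reduce the false positives.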

daeken · 6 years ago

Tom Scott did a video on exactly that problem: https://www.youtube.com/watch?v=CcZdwX4noCE

Stratoscope · 6 years ago

Oh gosh, I grew up on The New Bob Cummings Show.

He had his own flying car, in 1961!

https://en.wikipedia.org/wiki/The_New_Bob_Cummings_Show

ginko · 6 years ago

As far as I can tell, the name "Cummings" is Gaelic, not Caucasian.

http://www.comins.org/cummingsname.htm

travbrack · 6 years ago

Or just let people use any name they want.

freepor · 6 years ago

https://deadspin.com/life-aint-easy-for-a-basketball-player-...

clouddrover · 6 years ago

> anyone named "Fuck", ostensibly

How very dare you. I'll have you know I come from a long line of Fuckers.

bigtunacan · 6 years ago

As someone already pointed out here, Fuck is a real last name. Fucker is also a real last name. You need a credit card to pay for Lyft; I don't quite follow why they can't just match people up with their credit card data.

It seems like tech companies have been doing DIY profanity filters. Apple has a profanity filter for engraving text onto a device in the online store (https://twitter.com/minimaxir/status/1213188371452841984 ); intended to be your name, so terms like "Dick" have to not be filtered.

I learnt that Apple has a validation endpoint. Since the engraving service was recently updated, I took a profanity wordlist and checked it against the endpoint just for fun. The results are...counterintuitive: https://pastebin.com/mzpECiQw (NSFW language, obviously)

rubyn00bie · 6 years ago

A bit of a tangent but you reminded me-- and maybe, I'm totally full of shit or like my memory is wrong, but the iPhone predicting profanity was my favorite feature. For YEARS it would never write "duck" instead of "fuck," and generally was great, but then (and this is the like maybe I'm totally wrong part) after Jobs passed away that stopped completely... and now I get this bullshit I have to fucking correct all the time (I have had to set up shortcuts to prevent it from fixing my profanity).

I can't imagine Jobs putting up with trying to swear at someone over text and repeatedly getting "ducking."

unlinked_dll · 6 years ago

iOS autocorrect has always been terrible, it’s not exclusive to profanity. Both in terms of its performance and experience. It’s incredibly frustrating to use when it autocorrects words 3-4 words later, and navigating text fields has become significantly more difficult on recent versions of iOS.

Android is the superior experience in almost every UX category imo, and text handling is the best example of that.

sent from my iPhone.

vernie · 6 years ago

Jobs had quite the puritanical streak when it came to his products.

qwerty456127 · 6 years ago

> semen: Valid

By the way Semen is the way the Russian equivalent to Simon is spelled officially. There is quite a number of Russian men who have just that for their first name in their passports. I once met one, he worked as a developer and people were making fun of his name.

9nGQluzmnq3M · 6 years ago

Surely Semyon would be the more common way of rendering семён?

danso · 6 years ago

The Apple endpoint, inconsistent as it was, was a little more permissive than I expected. At least compared to something like the Sony Playstation ID filter; for example, 'hitler' is fine for Apple, whereas Sony seems to block out any ID containing the literal string of 'hitle', e.g. `hitle123987` [0]. Meanwhile, Apple manages to block some of the more esoteric sexual profanity that Sony's filter can miss, e.g. `bunghole`.

[0] curl -X POST https://accounts.api.playstation.com/api/v1/accounts/onlineI... -H "Content-Type: application/json" -d '{"onlineId":"hitle123987","reserveIfAvailable":true}'

zeta0134 · 6 years ago

I'm still trying to work out what Sony's filter doesn't like about my username (zeta0134) which I've had for ages. The first bit is just the letter "z" in Spanish; there's no deeper meaning. To my knowledge it's not considered offensive in Spanish, but Sony would not let me have it no matter how many variations I tried. Maybe there's some political thing I'm not aware of?

Anyway, I switched languages and I'm "zed0134" on their network. The whole thing struck me as a bit odd.

TurkishPoptart · 6 years ago

What does this code do? Is it an API call? How can I run it myself?

ben_w · 6 years ago

The profanity filter in the Apple App Store was [0] amusingly wrong, but fortunately also a soft filter rather than a hard filter.

What happened was, the German localisation of the app description included the word “Knopf”. Knopf is not a rude word, according to any German I’ve discussed this with — it is one translation of “knob” in the sense of “button”, but Apple’s naughty word detector clearly thought it was “knob” in the sense of the euphemism for a body part.

It didn’t stop the app passing review, but the automatic warning was still a regular part of updates for that particular app.

[0] Back in 2012

dathinab · 6 years ago

Honestly, if filters try to filter out euphemisms, all is lost because:

1. People who use them for that just come up with new ones all the time.

2. People still use the word in the regular sense. Would I now have to come up with a euphemism to describe a door knob?!

3. How I word things is my decision. As long as I don't hurt anyone intentionally or knowingly, no person, and even less a company, has the (moral) right to constrain me. (Though with respect to minors, at least younger ones, their parents' opinion matters, too.)

userbinator · 6 years ago

I've seen "schwanz" (tail) more commonly used as the euphemism for the body part.

reaperducer · 6 years ago

I wonder if there are any Knopf titles in the Apple Books app.

https://en.wikipedia.org/wiki/Alfred_A._Knopf

drak0n1c · 6 years ago

Funnily enough, 'The Outer Worlds', an offline single-player RPG game, censors player names.

https://wireframe.raspberrypi.org/articles/banned-character-...

kova12 · 6 years ago

The list does not sound arbitrary to me in the slightest.