The Pythonic Emptiness

I don't find any of the arguments here particularly convincing. This claim in particular is weird:

> Similarly, when you use len() to check a sequence for emptiness, you are reaching out for a more powerful tool than necessary. As a result it may make the reader question the intent behind it, e.g. if there is a possibility of handling objects other than a built-in sequence type here?

Given that checking for truthiness is less strict than a length test, by the same token, whenever you use it, you're reaching for an even more powerful tool than necessary. And, if anything, seeing `not items` is what makes me question the intent - did the author mean to also check for None etc here, or are they just assuming that it's never going to be that? And sure, well-written code will have other checks and asserts about it - but when I'm reading your code, I don't know if it's missing an assert like that because you intended it, or because you couldn't be bothered to write it.

OTOH len() is very explicit about what is actually checked, and will fail fast and loudly if the argument is not actually a sequence.

Also note that it's not, strictly speaking, an either-or - you can use `not len(x)` instead of `len(x) == 0` if you want a distinctive pattern specifically for empty collection checks.
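
To make the difference concrete, here is a minimal sketch of how the two checks treat an argument that is not a sized collection. The function names are made up for illustration, and note that len() accepts anything with a __len__ (dicts and sets too), not only sequences:

```python
def is_empty_truthy(items):
    # Truthiness check: None, 0, "" and [] all count as "empty".
    return not items


def is_empty_len(items):
    # Length check: only objects that implement __len__ are accepted.
    return len(items) == 0


print(is_empty_truthy([]))    # True
print(is_empty_truthy(None))  # True: None silently passes as "empty"
print(is_empty_len([]))       # True

try:
    is_empty_len(None)
except TypeError as exc:
    # len(None) raises instead of quietly treating None as empty.
    print(f"TypeError: {exc}")
```

Whether the silent acceptance of None is a feature or a bug depends on what the caller intended, which is exactly what the reader cannot see from `not items` alone.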

wenc · a year ago

For me, len() is much less ambiguous.

Also, in pandas, if you try to check the truth value of a DataFrame to see if it’s empty, it will fail. It raises a ValueError saying “The truth value of a DataFrame is ambiguous”.

df.empty is less ambiguous but you have to remember it specifically for dataframes.

But len(df) > 0 almost always works for any type of collection.
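
A rough sketch of that point, without assuming pandas is installed: `Table` below is a hypothetical stand-in that only imitates the behavior described above (refusing to be truth-tested). It is not pandas' implementation, and `has_rows` is an invented helper name.

```python
class Table:
    """Hypothetical stand-in that mimics a DataFrame's refusal to be truth-tested."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __len__(self):
        return len(self.rows)

    def __bool__(self):
        # Assumption: modeled loosely on the pandas error quoted above.
        raise ValueError("The truth value of a Table is ambiguous.")


def has_rows(collection):
    # len() works for lists, tuples, dicts, sets, and the stand-in alike.
    return len(collection) > 0


print(has_rows([]))            # False
print(has_rows({"a": 1}))      # True
print(has_rows(Table([1, 2]))) # True

try:
    if Table([1, 2]):
        pass
except ValueError as exc:
    print(f"ValueError: {exc}")
```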

adamc · a year ago

I don't agree with this at all, and I wonder if it reflects what other languages you use that may have shaped your assumptions. `if mylist` feels very much like Common Lisp to me. In much of the code I've seen, the value would never be None because an empty list was definitely created.

Joker_vD · a year ago

Ah yes, reusing an empty list NIL as the false value because having a separate #f atom is bad for whatever reason.

Python's truthiness behavior was the trigger for one of my worst ever bugs early in my career, which not only pulled in senior engineering but also marketing/comms and legal to help sort out the mess. Not a fan!

adamc · a year ago

I think this needs more explanation to know if this is a good argument or not.

FreakLegion · a year ago

Keep in mind that truthiness comes from __bool__ (or from __len__ when __bool__ isn't defined) and is overridable, so separate from Python itself, a lot of library authors have made questionable decisions here. A perennial contender is https://github.com/psf/requests/issues/2002.
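
As an illustration of how an overridden __bool__ can surprise callers, here is a small sketch. `ApiResponse` and `describe` are hypothetical and not taken from any particular library:

```python
class ApiResponse:
    """Hypothetical response object; not any real library's API."""

    def __init__(self, status_code, body):
        self.status_code = status_code
        self.body = body

    def __bool__(self):
        # Author's choice: the object is "true" only for non-error statuses.
        return self.status_code < 400


def describe(response):
    if not response:
        # Reached for None *and* for a real response with an error status.
        return "no response"
    return f"got {response.status_code}"


print(describe(None))                           # no response
print(describe(ApiResponse(200, "ok")))         # got 200
print(describe(ApiResponse(404, "not found")))  # no response
```

A caller who meant "did I get an object back at all" would need `response is not None` here.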

aguaviva · a year ago

You know you're going to need to provide us with a little snippet demonstrating this behavior now, right?

lihaoyi · a year ago

I went and dug up the original code that caused an issue. Here it is:

https://github.com/python/cpython/blob/v2.7.1/Lib/cgi.py#L60...

Python's std lib `cgi.FieldStorage` object was falsy if it did not define any headers, even if it contained file data.

Thus my conditional checking whether a file was being uploaded, `if request.field_storage`, was going through the False branch when files were being uploaded, but only in certain header-free scenarios not covered by automated and manual testing. This resulted in us dropping user data on the floor and losing uploaded files for a very large number of users before we realized and shut it off.

The other sibling post contains another example where people may be confused, and Google pulls up others. But this is the concrete case that caused us to send out a hundred thousand apology emails to affected customers after losing their files.
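
The general shape of that failure can be sketched with a simplified stand-in. `FormData` and both functions below are hypothetical and are not the actual `cgi.FieldStorage` code; the assumption is only that the object's truthiness is derived from something other than the data you care about:

```python
class FormData:
    """Hypothetical container; not the real cgi.FieldStorage."""

    def __init__(self, fields=None, file_bytes=None):
        self.fields = dict(fields) if fields is not None else {}
        self.file_bytes = file_bytes

    def __len__(self):
        # Length counts named fields only, so truthiness ignores file_bytes.
        return len(self.fields)


def save_upload_buggy(form):
    if form:  # falsy whenever there are no fields, even with file data
        return "saved"
    return "skipped"


def save_upload_explicit(form):
    # Test exactly the conditions that matter instead of the object's truthiness.
    if form is not None and form.file_bytes is not None:
        return "saved"
    return "skipped"


upload = FormData(file_bytes=b"file contents")
print(save_upload_buggy(upload))     # skipped
print(save_upload_explicit(upload))  # saved
```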

int_19h · a year ago

sjsdaiuasgdia · a year ago

I'm mostly annoyed that 'if len(items) == 0' / 'if len(items) > 0' aren't presented as options.

If we're talking about readability, they're far clearer than either of the options in the article and require no pre-knowledge of truthiness rules.

jasperry · a year ago

I agree that `len(seq) == 0` is more readable. I don't mind the recommended truth test of the sequence itself, but I have no idea who would use their "wrong" option with the length as a truth value. Or maybe POSIX exit codes (0 is success) have made me shy about using integers as truth values.

warkdarrior · a year ago

They are presented as options, it's just that their performance sucks. From the article:

    if not mylist:       # 1.061

    if len(mylist) == 0: # 1.924
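
If you want to check the relative cost yourself, a sketch along these lines should work. The helper name `time_check` is made up, and the absolute numbers will differ by machine and Python version, so no expected output is shown:

```python
import timeit

mylist = []


def time_check(stmt, number=1_000_000):
    # Total seconds taken by `number` evaluations of `stmt`.
    return timeit.timeit(stmt, globals={"mylist": mylist}, number=number)


truthy_seconds = time_check("not mylist")
len_seconds = time_check("len(mylist) == 0")

print(f"not mylist:       {truthy_seconds:.3f}s")
print(f"len(mylist) == 0: {len_seconds:.3f}s")
```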

James_K · a year ago

I feel like, if you care about performance, using Python at all is a mistake.

bogwog · a year ago

When you profile your code and find that all you gotta do is change a few if statements for a 2x perf boost, that will be a happy day.

Isn't this just a Perl feature (arrays are also their length and zero values are falsy)? I can't help but feel Python is getting closer to Perl as time goes on. Ironic, since their original goal was to be simple and make themselves distinct from Perl. What was the saying again? "There should be one-- and preferably only one --obvious way to do it." Honestly, I think it was all this "Zen" stuff that led Python down the path to weirdness. This article reads like a monk interpreting sacred text. I can think of no good reason for all this malarkey.

jerf · a year ago

This can't be Python "getting closer" to Perl, since it's been this way in Python since inception, and I'm pretty sure, Perl as well. Both languages have always been exactly where they are now on the topic, with no motion, probably since the beginning, and certainly for multiple decades.

I'm not particularly well versed in the history of either language. I just know that Python was supposed to be "simple and intuitive", but I've had quite a different experience of it and this has often been down to Perl-like things going on.

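Well, lists in Python are not their length. They are convertible to a boolean, however: "if" calls __bool__() on objects that define it and otherwise falls back to __len__(), which is the case for lists. That is how they (or any other object with one of these methods) get tested for "truthiness".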
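
A quick way to see how truth testing resolves, as a sketch: lists define __len__ but no __bool__ of their own, and a class that defines both gets __bool__ consulted first. `Bag` and `AlwaysTrueBag` are hypothetical names used only for this demonstration:

```python
print(hasattr(list, "__bool__"))  # False
print(hasattr(list, "__len__"))   # True


class Bag:
    """Hypothetical container that defines only __len__."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class AlwaysTrueBag(Bag):
    def __bool__(self):
        # When present, __bool__ takes precedence over __len__.
        return True


print(bool(Bag([])))            # False, via __len__() == 0
print(bool(Bag([1])))           # True
print(bool(AlwaysTrueBag([])))  # True, __len__ is never asked
```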

mulmboy · a year ago

`if x: foo()` is a cancer on the Python community. Devs often use it with the intention of handling x being None, and carelessly lump in zero and empty lists/strings at the same time. Endless bugs.
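
A minimal sketch of that bug pattern, using a made-up `timeout` parameter where 0 is a legitimate value and None means "not provided":

```python
def describe_timeout_buggy(timeout=None):
    if timeout:
        return f"timeout={timeout}"
    # Also reached for timeout=0, which was a real value.
    return "no timeout"


def describe_timeout_explicit(timeout=None):
    if timeout is not None:
        return f"timeout={timeout}"
    return "no timeout"


print(describe_timeout_buggy(0))        # no timeout
print(describe_timeout_explicit(0))     # timeout=0
print(describe_timeout_explicit(None))  # no timeout
```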

smetj · a year ago

Yup!

FridgeSeal · a year ago

Python’s “truthiness” is a cutesy feature that is just an excuse for bugs in your code. It’s opaque and too magic, reads poorly, and causes endless confusion.

Just use a normal check, like everyone is expecting to see.

“Oh but what if it’s not a sequence”, well then you have bigger problems. Why are you emptiness testing something that may-or-may-not-be-a-sequence? Maybe solve that problem first.

Chris_Newton · a year ago

Indeed. Relying on truthiness has always felt very un-Pythonic to me, not least because it contradicts several principles in the Zen of Python:

• Explicit is better than implicit.

• Special cases aren’t special enough to break the rules.

• There should be one — and preferably only one — obvious way to do it.

morkalork · a year ago

Holy moly, that meme about type checkers and variable names, someone is arguing for Hungarian notation in 2024?!

pseudalopex · a year ago

Someone argued user_list, user_count, and has_users are clearer than users, users, and users. Will you argue the opposite?

The original Hungarian notation was a less readable implementation of the same idea. The Hungarian notation most people hated replaced functional types like count and index with data types like unsigned long. And used them everywhere.

wodenokoto · a year ago

The idea that `if len(items):` indicates that you expect other things than sequences seems backwards to me.

It’s the duck-typing nature of classic Python that leads the community to recommend the broader `if items:`, which allows for numbers and such.