My favorite Python "crime" is that a class that defines __rrshift__, instantiated and used as a right-hand-side, lets you have a pipe operator, regardless of the left-hand-side (as long as it doesn't define __rshift__).
It's reasonably type-safe, and there's no need to "close" your chain - every outputted value as you write the chain can have a primitive type.
It shines in notebooks and live coding, where you might want to type stream-of-thought in the same order of operations that you want to take place. Need to log where something might be going wrong? Tee it like you're on a command line!
Idiomatic? Absolutely not. Something to push to production? Not unless you like being stabbed with pitchforks. Actually useful for prototyping? 1000%.
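For anyone who wants to try it, here is a minimal sketch of the trick described above. The names Pipe and tee are made up for illustration and are not from any library; the only real mechanism is Python's reflected-operator lookup for __rrshift__.

```python
class Pipe:
    """Hypothetical helper: `value >> Pipe(fn)` evaluates fn(value)."""

    def __init__(self, fn):
        self.fn = fn

    def __rrshift__(self, left):
        # Reached when the left operand cannot handle `>>` with a Pipe itself.
        return self.fn(left)


def tee(label):
    """Hypothetical helper: print the value passing through, then hand it on unchanged."""
    def _log(value):
        print(f"{label}: {value!r}")
        return value
    return Pipe(_log)


result = (
    [3, 1, 2]
    >> Pipe(sorted)
    >> tee("sorted")
    >> Pipe(lambda xs: [x * 10 for x in xs])
    >> Pipe(sum)
)
```

Every intermediate value is a plain list or int, so the chain can be cut off at any step without "closing" anything.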
Having spent a lot of time lurking on the frustratingly-slow-moving bikeshedding thread for the JavaScript pipe operator [0], there's a great irony that a lot of people want a pipe operator because they don't want to deal with function composition in any way, other than just applying a series of operations to their data!
I think there's a big gap pedagogically here. Once a person understands functional programming, these kinds of composition shorthands make for very straightforward and intuitive code.
But, if you're just understanding basic Haskell/Clojure syntax, or stuck in the rabbit hole of "monad is a monoid" style introductions, a beginner could easily start to think: "This functional stuff is really making me need to think in reverse, to need to know my full pipeline before I type a single character, and even perhaps to need to write Lisp-like (g (f x)) style constructs that are quite the opposite of the way my data is flowing."
I'm quite partial to tutorials like Railway Oriented Programming [1] which start from a practical imperative problem, embrace the idea that data and code should feel like they flow in the same direction, and gradually guide the reader to understanding the power of the functional tools they can bring to bear!
If anything, I hope this hack sparks good conversations :)
Sadly many things define the __or__ operator, including dicts and sets which are common to find in pipelines. (https://peps.python.org/pep-0584/ was a happy day for everyone but me!)
In practice, rshift gives a lot more flexibility! And you’d rarely chain after a numeric value.
Personally, I have never liked the PEP 634 pattern matching. I write a lot of code in Python. 99% of the time when I could use pattern matching, I am going to use simple if statements or dictionaries. Most of the time, they are more straightforward and easier to read, especially for developers who are more familiar with traditional control flow.
Dictionaries with a limited key and value type definition are fine, but dictionaries as a blind storage type are a recipe for crashing in prod with type errors or key errors. Structural pattern matching exists to support type safety.
I'll argue that code is in fact not easy to read if reading it doesn't tell you what type an item is and what a given line of code using it even does at runtime.
What’s the problem using it as a switch statement if you care about typographic issues? I do this so I’d like to know if I missed something and this is a bad practice.
I've never understood why Python's pattern-matching isn't more general.
First, "case foo.bar" is a value match, but "case foo" is a name capture. Python could have defined "case .foo" to mean "look up foo as a variable the normal way" with zero ambiguity, but chose not to.
Second, there's no need to special-case some builtin types as matching whole values. You can write "case float(m): print(m)" and print the float that matched, but you can't write "case MyObject(obj): print(obj)" and print your object. Python could allow "..." or "None" or something in __match_args__ to mean "the whole object", but didn't.
After doing Erlang and Scala pattern matching, the whole Python implementation just feels really ugly and gross. They should have cribbed a lot more of how Scala does it.
> While potentially useful, it introduces strange-looking new syntax without making the pattern syntax any more expressive. Indeed, named constants can be made to work with the existing rules by converting them to Enum types, or enclosing them in their own namespace (considered by the authors to be one honking great idea)[...]
If needed, the leading-dot rule (or a similar variant) could be added back later with no backward-compatibility issues.
second: you can use case MyObject() as obj: print(obj)
I don't think I've written a match-case yet. Aside from not having a lot of use cases for it personally, I find that it's very strange-feeling syntax. It tries too hard to look right, with the consequence that it's sometimes quite hard to reason about.
> > While potentially useful, it introduces strange-looking new syntax without making the pattern syntax any more expressive. Indeed, named constants can be made to work with the existing rules by converting them to Enum types, or enclosing them in their own namespace (considered by the authors to be one honking great idea)[...]
Yeah, and I don't buy that for a microsecond.
A leading dot is not "strange" syntax: it mirrors relative imports. There's no workaround because it lets you use variables the same way you use them in any other part of the language. Having to distort your program by adding namespaces that exist only to work around an artificial pattern matching limitation is a bug, not a feature.
Also, it takes a lot of chutzpah for this PEP author to call a leading dot strange when his match/case introduces something that looks lexically like constructor invocation but is anything but.
The "as" thing works with primitives too, so why do we need int(m)? Either get rid of the syntax or make it general. Don't hard-code support for half a dozen stdlib types for some reason and make it impossible for user code to do the equivalent.
The Python pattern matching API is full of most stdlib antipatterns:
* It's irregular: matching prohibits things that the shape of the feature would suggest are possible because the PEP authors couldn't personally see a specific use case for those things. (What's the deal with prohibiting multiple _ but allowing as many __ as you want?)
* It privileges stdlib, as I mentioned above. Language features should not grant the standard library powers it doesn't extend to user code.
* The syntax feels bolted on. I get trying to reduce parser complexity and tool breakage by making pattern matching look like object construction, but it isn't, and the false cognate thing confuses every single person who tries to read a Python program. They could have used := or some other new syntax, but didn't, probably because of the need to build "consensus".
* The whole damn thing should have been an expression, like the if/then/else ternary, not a statement useless outside many lexical contexts in which one might want to make a decision. Why is it a statement? Probably because the PEP author didn't _personally_ have a need to pattern match in expression context.
And look: you can justify any of these technical decisions. You can find a way to justify anything you might want to do. The end result, however, is a language facility that feels more cumbersome than it should and is applicable to fewer places than one might think.
> If needed, the leading-dot rule (or a similar variant) could be added back later with no backward-compatibility issues.
So what, after another decade of debate, consensus, and compromise, we'll end up with a .-prefix-rule but one that works only if the character after the dot is a lowercase letter that isn't a vowel.
PEP: "We decided not to do this because inspection of real-life potential use cases showed that in vast majority of cases destructuring is related to an if condition. Also many of those are grouped in a series of exclusive choices."
I find this philosophical stance off-putting. It's a good thing when users find ways to use your tools in ways you didn't imagine.
PEP: In most other languages pattern matching is represented by an expression, not statement. But making it an expression would be inconsistent with other syntactic choices in Python. All decision making logic is expressed almost exclusively in statements, so we decided to not deviate from this.
We've had conditional expressions for a long time.
Could someone explain just what's so bad about this?
My best guess is that it adds complexity and makes code harder to read in a goto-style way where you can't reason locally about local things, but it feels like the author has a much more negative view ("crimes", "god no", "dark beating heart", the elmo gif).
Maybe I have too much of a "strongly typed language" view here, but I understood the utility of isinstance() as verifying that an object is, well, an instance of that class - so that subsequent code can safely interact with that object, call class-specific methods, rely on class-specific invariants, etc.
This also makes life directly easier for me as a programmer, because I know in what code files I have to look to understand the behavior of that object.
Even linters use it for that purpose, e.g. resolving call sites by looking at the last isinstance() check to determine the type.
__subclasshook__ puts this at risk by letting a class lie about its instances.
As an example, consider a class Everything whose __subclasshook__ claims every object as an instance, and code that calls x.foo() inside an "if isinstance(x, Everything):" block. A linter would pass this code without warnings, because it assumes that the if block is only entered if x is in fact an instance of Everything and therefore has the foo() method.
But what really happens is that the block is entered for any kind of object, and objects that don't happen to have a foo() method will throw an exception.
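The code for this example did not survive in the comment, so the following is only a guess at what an Everything class could look like. The __subclasshook__ body and the call_foo helper are assumptions, not the commenter's original code.

```python
from abc import ABC


class Everything(ABC):
    """Assumed reconstruction: an ABC that claims every class as a subclass."""

    @classmethod
    def __subclasshook__(cls, subclass):
        # Lie unconditionally: every class counts as a subclass of Everything.
        return True

    def foo(self):
        return "foo"


def call_foo(x):
    # A type checker narrows x to Everything here and accepts x.foo().
    if isinstance(x, Everything):
        return x.foo()
    return None
```

With this sketch, isinstance(42, Everything) is True, and call_foo(42) raises AttributeError because int has no foo() method.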
You _can_ write pathological code like the Everything example, but I can see this feature being helpful if used responsibly.
It essentially allows the user to check if a class implements an interface, without explicitly inheriting ABC or Protocol. It’s up to the user to ensure the body of the case doesn’t reference any methods or attributes not guaranteed by the subclass hook, but that’s not necessarily bad, just less safe.
I took the memes as largely for comedic effect, only?
I do think there is a ton of indirection going on in the code that I would not immediately think to look for. As the post stated, could be a good reason for this in some things. But it would be the opposite of aiming for boring code, at that point.
TL;DR having a class that determines if some other class is a subclass of itself based off of arbitrary logic and then using that arbitrary logic to categorize other people's arbitrary classes at runtime is sociopathic.
Some of these examples are similar in effect to what you might do in other languages, where you define an 'interface' and then you check to see if this class follows that interface. For example, you could define an interface DistancePoint which has the fields x and y and a distance() method, and then say "If this object implements this interface, then go ahead and do X".
Other examples, though, are more along the lines of if you implemented an interface but instead of the interface constraints being 'this class has this method' the interface constraints are 'today is Tuesday'. That's an asinine concept, which is what makes this crimes and also hilarious.
You better not find out about Protocols in Python then. The behavior you describe is exactly how duck typing / "structural subtyping" works. Your class will be an instance of Iterable if you implement the right methods having never known the Iterable class exists.
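A minimal sketch of the Iterable case: the Countdown class below is made up for illustration and never imports or inherits from Iterable, yet the isinstance check passes because collections.abc.Iterable looks for __iter__.

```python
from collections.abc import Iterable


class Countdown:
    """Illustrative class with no base classes; it only defines __iter__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


is_iterable = isinstance(Countdown(3), Iterable)
values = list(Countdown(3))
```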
I don't find using __subclasshook__ to implement structural subtyping that you can't express with Protocols/ABCs alone to be that much of a crime. You can do evil with it but I can perform evil with any language feature.
The real crime is the design of Python's pattern matching in the first place:
    match status:
        case 404:
            return "Not found"

    not_found = 404
    match status:
        case not_found:
            return "Not found"
Everywhere else in the language, you can give a constant a name without changing the code's behaviour. But in this case, the two snippets are very different: the first checks for equality (`status == 404`) and the second performs an assignment (`not_found = status`).
At least functional languages tend to have block scope, so the latter snippet introduces a new variable that shadows `not_found` instead of mutating it.
No, at least in Erlang a variable is assigned once, you can then match against that variable as it can't be reassigned:
    NotFound = 404,
    case Status of
        NotFound -> "Not Found";
        _ -> "Other Status"
    end.
That snippet will return "Other Status" for Status = 400. The Python equivalent of that snippet is a SyntaxError as the first case is a catch all and the rest is unreachable.
Destructuring is a feature. Making it easy to confuse value capture and value reference was an error. Add single-namespace locals and you have a calamity.
Destructuring yes but you can still argue it's poorly designed. Particularly unintuitive because matching on a nested name e.g. module.CONSTANT works for matching and doesn't destructure. It's just the use of an unnested name which does destructuring.
What Python needs is what Elixir has. A "pin" operator that forces the variable to be used as its value for matching, rather than destructuring.
It wouldn't be a problem if Python had block-level variable scope. Having that destructuring be limited to the 'case' would be fine, but obliterating the variable outside of the match is very unexpected.
The very first example there shows a match/case block where almost every single case just runs "pass" and yet every single one has a side effect. It's very difficult to read at first, difficult to understand if you're new to the syntax, and is built entirely around side effects. This might be one of the worst PEPs I've ever seen just based on that example alone.
Fun fact: you can do the same thing with the current match/case, except that you have to put your logic in the body of the case so that it's obvious what's happening.
Problem is, we already have a syntax for empty lists [], empty tuples (), and {} is taken for an empty dict. So having a syntax for an empty set actually makes sense to me
Making sense and being good are not necessarily the same.
Yes, having a solution for this makes sense, but the proposed solutions are just not good. Sometimes one has to admit that not everything can be solved gracefully and just stop hunting the whale.
Next step is to have the subclass check pack all the code up, send it to ChatGPT, ask it if it thinks subjectively that class A should be a subclass of class B, and then run sentiment analysis on the resulting text to make the determination.
Looks a lot like function composition with the arguments flipped, which in Haskell is `>>>`. Neat!
But since you’re writing imperative code and binding the result to a variable, you could also compare to `>>=`.
(https://downloads.haskell.org/~ghc/7.6.2/docs/html/libraries...)
[0] https://github.com/tc39/proposal-pipeline-operator/issues/91 - 6 years and 793 comments!
[1] https://fsharpforfunandprofit.com/rop/
I wanted to wash my eyes the first time I saw it.
That said, I don't think OP's antics are a crime. That SyntaxError though, that might be a crime.
And a class-generating callable class would get around Python caching the results of __subclasshook__.
Here's how to do it right: https://www.gnu.org/software/emacs/manual/html_node/elisp/pc...
All things have a place and time.
https://x.com/brandon_rhodes/status/1360226108399099909
Ruby's `case`/`in` has the same problem.
Crimes with Python's pattern matching
406 points on Aug 2, 2022. 120 comments
https://news.ycombinator.com/item?id=32314368