> Chaining together map/reduce/filter and other functional programming constructs (lambdas, iterators, comprehensions) may be concise, but long/multiple chains hurt readability
This is not at all implied by anything else in the article. This feels like a common "I'm unfamiliar with it so it's bad" gripe that the author just sneaked in. Once you become a little familiar with it, it's usually far easier to both read and write than any of the alternatives. I challenge anyone to come up with a more readable example of this:
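For instance, a chain in the spirit being defended (the data and names here are illustrative, not the article's):

```javascript
// Illustrative data: orders with a status and a total.
const orders = [
  { id: 1, total: 250, status: "shipped" },
  { id: 2, total: 40,  status: "pending" },
  { id: 3, total: 120, status: "shipped" },
];

const shippedRevenue = orders
  .filter(order => order.status === "shipped") // keep only shipped orders
  .map(order => order.total)                   // pull out the totals
  .reduce((sum, total) => sum + total, 0);     // add them up

console.log(shippedRevenue); // 370
```

Each step names exactly one operation, and the whole pipeline reads top to bottom.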
By almost any complexity metric, including his, this code is going to beat the snot out of any other way of doing this. Please, learn just the basics of functional programming. You don't need to be able to explain what a Monad is (I barely can). But you should be familiar enough that you stop randomly badmouthing map and filter like you have some sort of anti-functional-programming Tourette's syndrome.
This comment seems unnecessarily mean-spirited... perhaps I just feel that way because I'm the person on the other end of it!
I agree the code you have there is very readable, but it's not really an example of what that sentence you quoted is referencing... However I didn't spell out exactly what I meant, so please allow me to clarify.
For me, roughly 5 calls in a chain is where things begin to become harder to read, which is the length of the example I used.
For the meaning of "multiple", I intended nested chains, or chains where the type being operated on changes partway through; either of those can slow down my rate of reading.
Functional programming constructs can be very elegant, but it's possible to go overboard :)
To me the functional style is much easier to parse as well. Maybe the lesson is that familiarity can be highly subjective.
I, for example, prefer a well-chosen one-liner list comprehension in Python over a loop with temporary variables and nested if statements most of the time. That is because people who use list comprehensions usually do not program them with side effects, so once understood, that block of code stands on its own.
The same is true for the builder style code. I just need to know what each step does and I know what comes out in the end. I even know that the object that was set up might become relevant later.
With the traditional imperative style that introduces intermediate variables, I might infer that those are just temporary, but I can't be sure until I read on, so I have to keep those variables in my head, which leaves me, in the end, with many more possible ways future code could play out. The intermediate variables have the benefit of clarifying steps, but you can have that with a builder pattern too, if the interface is well chosen (or if you add comments).
This is why, in an imperative style, variables that are never used again should be marked (one convention is an underscore prefix like _temporaryvalue; a language like Rust even enforces this via compiler warnings). But guess what: to a person unfamiliar with that convention this increases mental complexity ("What is that weird name?"), while it factually should reduce it ("I don't have to keep that variable in my head, as it won't matter in the future").
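A tiny sketch of that convention in JavaScript (the names are made up; here the underscore is purely a convention for readers and linters, not something the language enforces):

```javascript
// _status is deliberately unused; the underscore prefix tells the
// reader they don't need to track it for the rest of the function.
const [_status, payload] = ["ok", 42];
console.log(payload); // 42
```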
In the end many things boil down to familiarity. For example, in electronics many people prefer to write a 4.7 kΩ value as 4k7 instead, as it prevents you from accidentally overlooking the decimal point and making an off-by-a-magnitude error. This was particularly important in the golden age of the photocopier, as you can imagine. However, show that to a beginner and they will wonder what it is supposed to mean. Familiarity is subjective, and every expert was once a beginner coming from a different world, where different things were familiar.
Something being familiar to a beginner (or someone who learned a particular way of doing X) is valuable, but it is not necessarily an objective measure of how well suited that representation is for a particular task.
The dig on chains of map/reduce/filter was listed as a "Halstead Complexity Takeaway", and seemed to come out of the blue, unjustified by any of the points made about Halstead complexity. In fact in your later funcA vs. funcB example, funcB would seem to have higher Halstead complexity due to its additional variables (depending on whether they count as additional "operands" or not). In general, long chains of functions seem like they'd have lower Halstead complexity.
The "anti-functional Tourette's" comment was partly a response to how completely random and unjustified it seemed in that part of the article, and also that this feels like a very common gut reaction to functional programming from people who aren't really willing to give it a try. I'm not only arguing directly against you here, but that attitude at large.
Your funcA vs. funcB example doesn't strike me as "functional" at all. No functions are even passed as arguments. That "fluent" style of long chains has been around in OO languages for a while, independent of functional programming (e.g. see d3.js*, which is definitely not the oldest). Sure, breaking long "fluent" chains up with intermediate variables can sometimes help readability. I just don't really get how any of this is the fault of functional programming.
I think part of the reason funcB seems so much more readable is that neither function's name explains what it's trying to do, so you go from 0 useful names to 3. If the function was called "getNamesOfVisibleNeighbors" it'd already close the readability gap a lot. Of course if it were called that, it'd be more clear that it might be just trying to do too much at once.
I view the "fluent" style as essentially embedding a DSL inside the host language. How readable it is depends a lot on how clear the DSL itself is. Your examples benefit from additional explanation partly because the DSL just seems rather inscrutable and idiosyncratic. Is it really clear what ".data()" is supposed to do? Sure, you can learn it, but you're learning an idiosyncrasy of that one library, not an agreed-upon language. And why do we need ".nodes()" after ".connected()"? What else can be connected to a node in a graph other than other nodes? Why do you need to repeat the word "node" in a string inside "graph.nodes()"? Why does a function with the plural "nodes" get assigned to a singular variable? As an example of how confusing this DSL is, you've claimed to find "visibleNames", but it looks to me like you've actually found the names of visible neighborNodes. It's not the names that are not(.hidden), it's the nodes, right? Consider this:
Note how much clearer ".filter(node => !node.isHidden)" is than ".not('.hidden')", and ".map(node => node.name)" versus ".data('name')". It's much harder to get confused about whether it's the node or the name that's hidden, etc.
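Spelled out with plain objects (hypothetical stand-ins for the library's nodes), the rewritten chain makes the subject of each operation explicit:

```javascript
// Hypothetical node objects standing in for the graph DSL discussed above.
const neighborNodes = [
  { name: "a", isHidden: false },
  { name: "b", isHidden: true },
  { name: "c", isHidden: false },
];

// It's the *nodes* that are hidden, not the names, and the code says so:
const namesOfVisibleNeighbors = neighborNodes
  .filter(node => !node.isHidden)
  .map(node => node.name);

console.log(namesOfVisibleNeighbors); // ["a", "c"]
```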
Getting the DSL right is really hard, which only increases the benefit of using things like "map" and "filter" which everyone immediately understands, and which have no extrinsic complexity at all.
You could argue that it's somehow "invalid" to change the DSL, but my point is that if you're using the wrong tool for the job to begin with, then any further discussion of readability is in some sense moot. If you're doing a lot of logic on graphs, you should be dealing with a graph representation, not CSS classes and HTML attributes. Then the long chains are not an issue at all, because they read like a DSL in the actual domain you're working in.
*Sidenote: I hate d3's standard style, for some of the same reasons you mention, but mainly because "fluent" chains should never be mutating their operand.
>For me, roughly 5 calls in a chain is where things begin to become harder to read, which is the length of the example I used.
This isn't just about readability. Chaining, and FP in general, is structurally more sound. It is the more proper way to code from an architectural and structural-pattern perspective.
Given an array of numbers:
1. I want to add 5 to all numbers
2. I want to convert to string
3. I want to concat hello
4. I want to create a reduced comma-separated string
5. I want to capitalize all letters in the string.
This is what a for loop would look like:
// assume x is the array
    var acc = ""
    for (var i = 0; i < x.length; i++) {
        var value = x[i] + 5
        var stringValue = String(value) + "hello"
        acc += stringValue + ","
    }
    var result = ""
    for (var i = 0; i < acc.length; i++) {
        result += capitalLetter(acc[i])
    }
FP:
addFive(x) = [i + 5 for i in x]
toString(x) = [str(i) for i in x]
concatHello(x) = [i + "hello" for i in x]
reduceStrings(x) = reduce((acc, i) => acc + "," + i, x)
capitalize(x) = [capitalLetter(i) for i in x].join("")
You have 5 steps. With FP, all 5 steps are reusable. With the procedural version, they are not.
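A runnable sketch of those five steps in JavaScript (the input data is illustrative; the names match the steps above):

```javascript
// Each step is a small, reusable, independently testable function.
const addFive       = xs => xs.map(i => i + 5);
const toStrings     = xs => xs.map(i => String(i));
const concatHello   = xs => xs.map(s => s + "hello");
const reduceStrings = xs => xs.reduce((acc, s) => acc === "" ? s : acc + "," + s, "");
const capitalize    = s  => s.toUpperCase();

// Composing the steps end to end:
const result = capitalize(reduceStrings(concatHello(toStrings(addFive([1, 2])))));
console.log(result); // "6HELLO,7HELLO"
```

Each function can now be imported, named, and tested on its own, which is the reuse point being made.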
Mind you, I know you're thinking about chaining. Chaining is equivalent to inlining multiple operations together. So, for example, in that case:
x.map(...).map(...).map(...).reduce(...).map(...)
//can be made into
addFive(x) = x.map(...)
toString(x)= x.map(...)
...
By nature, functional code is modular, so such syntax can easily be extracted into modules, with each module given a name. The procedural code cannot do this; it is structurally unsound and tightly coupled.
It's not about going overboard here. The FP simply needs to be formatted to be readable, but it is the MORE proper way to code if you want your code modular, general, and decoupled.
Your example is a conceptually simple filter on a single list of items. But once the chain grows too long, the conditions become too complex, and there are too many lists/variables involved, it becomes impossible to understand everything at once.
In a procedural loop, you can assign an intermediate result to a variable. By giving it a name, you can forget the processing you have done so far and focus on the next steps.
You don't ever need to "understand everything at once". You can read each stanza linearly. The for-loop style is the approach where everything often needs to be understood all at once, since the logic is interspersed throughout the entire body.
In a practical example you'd create a named intermediate type, which becomes a new base for reasoning. Once you've convinced yourself that the first part of the chain, the part responsible for creating that type (or a collection of it), is correct, you can forget it and free up working memory to move on to the next part. The pure nature of the steps also makes them trivially testable, as you can call them individually with easy-to-construct values.
In fairness, if this was in a relational data store, the same code as above would probably look more like...
SELECT DISTINCT authors.some_field FROM books
JOIN authors ON books.author_id = authors.author_id
WHERE books.pageCount > 1000
And if you wanted to grab the entire authors record (like the code does) you'd probably need some more complexity in there:
SELECT * FROM authors WHERE author_id IN (
SELECT DISTINCT authors.author_id FROM books
JOIN authors ON books.author_id = authors.author_id
WHERE books.pageCount > 1000
)
This is 5 times more readable than the FP example above for the same computation. The FP example uses the variable book(s) five times, where using it once was sufficient for SQL. Perhaps FP languages could have learned something from SQL...
While I think your example is fine, I think the complaint was more about very long chains. Personally, I like to break them up to give intermediate results names, kinda like using variable names as comments:
var longBooks = books.filter(book => book.pageCount > 1000)
var authorsOfLongBooks = longBooks.map(book => book.author).distinct()
Good example, actually. You started with a books array, and changed the type to authors halfway through.
To know the return type of the chain, I have to read through it and get to the end of each line.
A longBooks array, and map(longBooks, ‘author’) wouldn’t be much longer, but would involve more distinct and meaningful phrases.
I used to love doing chains! I used lodash all the time for things like this. It’s fun to write code this way. But now I see that it’s just a one-liner with line breaks.
One core reason chaining can be bad is robustness; another longevity/maintenance.
Specifically around type safety: that is, knowing that the chained type is what you expect, and communicating that expectation to the person reading the code without them needing to know the wider context of either the chained API or the function the chain resides in. In the context of this article, that means more complexity, and therefore less readability.
I feel this is important because I have worked on many legacy code bases where bugs were found where chains were not behaving as expected, normally after attrition in some other part of the code base, and then you have to become a detective to work out the original intent.
For readability chains are bad, because they can lie about their intent, especially if there are various semantics that can be swapped. But, as in any industry or code base, if their use is consistent and the API is mature/stable, they can be powerful and fast.
FWIW, I'm plenty familiar with functional programming and iterator chains, and I still think for loops often beat them--not only from a "visual noise" perspective, but because complex iterator chains are harder to read than equivalent for loops (particularly when you have to deal with errors-as-values and short circuiting or other patterns) and for simple tasks iterator chains might be marginally simpler but the absolute complexity of the task is so low that a for loop is fine.
> But you should be familiar enough that you stop randomly badmouthing map and filter like you have some sort of anti-functional-programming Tourette's syndrome.
Fully agree on the easier-to-read part. But despite all the "code is read more often than written" arguments, I see a lot of merit in the functional way. There are a hundred ways to write the loop slightly wrong, or with intended behavior slightly different from the usual. The functional variant: not so much. A small variation from the standard loop in functional style is visible. Very visible. Unintended ones simply won't happen, and the intended ones are an eyesore, one impossible to miss while reading, whereas a subtle variation of the imperative loop is exactly that: subtle. Easy to miss. Readability advantage: functional.
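A contrived illustration of that point (data and bug are made up): a one-character slip in the loop reads almost identically to the correct version, while the functional form has no index to get wrong at all.

```javascript
const xs = [1, 2, 3, 4];

// Imperative: correct here, but a slip like `i <= xs.length` would
// read almost identically and quietly yield NaN (xs[4] is undefined).
let sum = 0;
for (let i = 0; i < xs.length; i++) {
  sum += xs[i];
}

// Functional: there is simply no index or bound to mistype.
const sum2 = xs.reduce((acc, x) => acc + x, 0);

console.log(sum, sum2); // 10 10
```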
In my book, keeping simple things simple and not-simple things not simple beats simplicity everywhere. This is actually what I consider a big drawback of the functional style: often the not-simple parts are way too condensed, almost indistinguishable from trivialities. But in the loop scenario it's often the reverse.
My happy place, when writing, would be an environment that has enough AST-level understanding to transform between both styles.
(other advantages of functional style: skills and habits transfer to both async and parallel, the imperative loop: not so much)
This is easy to read, but in reality I have found things to typically be a bit less straightforward.
Three things typically happen:
1. People who like these chains really like them. I've seen multiple "one-liner expressions" that were composed of several statements ANDed or ORed together, needing one or two line breaks.
2. When it breaks (and it does, in the real world), debugging is a mess. At least the last time I had to deal with it, there was no good way to put breakpoints in there. Maybe things have changed recently, but typically one had to rewrite it in classic style and then debug it.
3. It trips up a lot of otherwise good programmers who haven't seen it before.
I wouldn't call 3 long, which means you've picked a softball counterexample. If you were trying to play devil's advocate, you should have chosen a longer, legitimate one and shown how a loop or other construct would make it better.
    authors_of_long_books = set()
    for book in books:
        if len(book.pages) > 1000:
            authors_of_long_books.add(book.author)
    return authors_of_long_books
You are told explicitly at the beginning what the type of the result will be, you see that it's a single pass over books and that we're matching based on page count. There are no intermediate results to think about and no function call overhead.
When you read it out loud it's also natural, clear, and in the right order: "for each book, if the book has more than 1000 pages, add its author to the set."
That isn't natural to anyone who is not intimately familiar with procedural programming. The language-natural phrasing would be "which of these books have more than a thousand pages? Can you give me their authors?" -- which maps much closer to the parent's LINQ query than to your code.
> You are told explicitly at the beginning what the type of the result will be
I would argue that's a downside: you have to pick the appropriate data structure beforehand here, whereas .distinct() picks the data structure for you. If, in the future, someone comes up with a better way of producing a distinct set of things, the functional code gets that for free, but this code is locked into a particular way of doing things. Also, .distinct() tells you explicitly what you want, whereas the intention of set() is not as immediately obvious.
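For what it's worth, `.distinct()` is not a built-in Array method in JavaScript; a minimal helper can hide the "how" (here, a Set) behind the declarative name, which is exactly the flexibility being described:

```javascript
// The caller says *what* they want; the helper is free to change
// *how* distinctness is computed without touching call sites.
const distinct = xs => [...new Set(xs)];

const authors = ["Tolstoy", "Austen", "Tolstoy"];
console.log(distinct(authors)); // ["Tolstoy", "Austen"]
```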
> There are no intermediate results to think about
I could argue that there aren't really intermediate results in my example either, depending on how you think about it. Are there intermediate results in the SQL query "SELECT DISTINCT Author FROM Books WHERE Books.PageCount > 1000"? Because that's very similar to how I mentally model the functional chain.
There are also intermediate results, or at least intermediate state, in your code: at any point in the loop, your set is in an intermediate state. It's not a big deal there either though: I'd argue you don't really think about that state either.
> and no function call overhead
That's entirely a language-specific thing, and volatile: new versions of a language may change how any of this stuff is implemented under the hood. It could be that "for ... in" happens to be a relatively expensive construct in some languages. You're probably right that the imperative code is slightly faster in most languages today, and if it has been shown via performance analysis that this particular code is a bottleneck, it makes sense to sacrifice readability in favor of performance. But it is a sacrifice in readability, and the current debate is over which is more readable in the first place.
> a single pass over books
Another detail that may or may not be true, and probably doesn't matter. The overhead of different forms of loops is just not what's determining the performance of almost any modern application. Also, my example could be a single pass if those methods were implemented in a lazy, "query builder" form instead of an immediately-evaluated form.
In fact, whether this query should be immediately evaluated is not necessarily this function's decision. It's nice to be able to write code that doesn't care about that. My example works the same for a wide variety of things that "books" could be, and the strategy to get the answer can be different depending on what it is. It's possible the result of this code is exactly the SQL I mentioned earlier, rather than an in-memory set. There are lots of benefits to saying what you want, instead of specifying exactly how you want it.
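The lazy, "query builder" form mentioned above can be sketched with generators (the books here are illustrative): each item flows through the whole pipeline before the next is touched, so there is a single pass and no intermediate arrays.

```javascript
// Lazy filter/map via generators: nothing runs until consumed.
function* lazyFilter(iterable, pred) {
  for (const x of iterable) if (pred(x)) yield x;
}
function* lazyMap(iterable, fn) {
  for (const x of iterable) yield fn(x);
}

const books = [
  { title: "War and Peace", pageCount: 1225, author: "Tolstoy" },
  { title: "Novella",       pageCount: 90,   author: "Short" },
  { title: "Long Epic",     pageCount: 1400, author: "Tolstoy" },
];

// One pass over books; the Set collects distinct authors at the end.
const longBookAuthors = new Set(
  lazyMap(lazyFilter(books, b => b.pageCount > 1000), b => b.author)
);
console.log([...longBookAuthors]); // ["Tolstoy"]
```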
Maybe it's because I'm not familiar with such style, but I don't like how the code hides operational details. That is, if `books` contains one billion books, and the final result should contain about a hundred authors, how much extra memory does this use for intermediate results?
The best way to kick the tyres on this kind of question is to plug in something literally infinite. That way if you arrive at an answer you're probably doing something right with regard to space and time usage.
For example, use all the prime numbers as an expression in your chain.
    import Data.Function
    import Data.Numbers.Primes

    main = do
        let result :: [Int]
            result = primes
                & filter (startingWithDigit '5')
                & asPairs
                & map pairSum
                & drop 100000
                & take 10
        print result

    asPairs xs = zip xs (tail xs)
    pairSum (a, b) = a + b
    startingWithDigit d x = d == head (show x)
This is a valid concern that I also reacted a little bit to. One thing to note, though, is that it is often possible to make such chains lazy, so they only collect the end result without ever generating any intermediate arrays.
Which requires the author to actually have an idea of how big the numbers are, but that is very often the case regardless of how you write your code.
Sometimes I write code like this. Then I delete it and replace it with a for loop, because a loop is just easier to understand.
This functional style is what I call write-only code, because the only person who can understand it is the one who wrote it. Pandas loves this kind of method chaining, and it's one of the chief reasons pandas code is hard to read.
Chaining calls is an anti-pattern. Not only is this a needless duplication of ye olde imperative statement sequence, it also makes debugging, modifying ("oh, I need to call some function in the middle of the chain, ugh"), and understanding harder, for the superficial benefit of looking "cool".
It actively hurts maintainability, please stop using it.
There is a (large, I believe) aspect of good code that is fundamentally qualitative & almost literary. This annoys a lot of computer programmers (and academics) who are inclined to the mathematical mindset and want quantitative answers instead.
I love Dostoyevsky and Wodehouse; both wrote very well, but also very differently. While I don't think coding is quite that open a playing field, I have worked on good code bases that feel very different qualitatively. It often takes me a while to "get" the style of a code base, just as it may take me a while to get a new author.
I 100% agree with this. One of the best compliments I ever got (regarding programming) was from one of my principal engineers who said something along the lines of "your code reads like a story". He meant he could open a code file I had written, read from top to bottom and follow the 'narrative' in an easy way, because of how I'd ordered functions, but also how I created declarative implementations that would 'talk' to the reader.
I follow the pure functional programming paradigm which I think lends itself to this more narrative style. The functions are self contained in that their dependencies/inputs are the arguments provided or other pure functions, and the outputs are entirely in the return type.
This makes it incredibly easy to walk a reader through the complexity step-by-step (whereas other paradigms might have other complexities, like hidden state, for example). So, ironically, the most mathematically precise programming paradigm is also the best for the more narrative style (IMHO of course!)
I had a similar experience. I was a lead on a project where the client sent a functional expert to literally (at times) watch over my shoulder as I worked. He got very frustrated after a few weeks, seeing little in the way of code being laid down. He even complained to the project manager. That's because this was a complicated manufacturing system, and I was absorbing the necessary rules for it and designing it... a process that involved mostly sitting and thinking.
When I decided on the final design and basically barfed all the code out in a matter of days, I walked this guy (a non-programmer) through the code. He then wrote my manager a letter declaring it to be the "most beautiful code he had ever seen." I still have the Post-It she left in my cube telling me that.
I have little tolerance for untidy code, and also overly-clever syntax that wastes the reader's time trying to unravel it.
And now we have languages building more inconsistent and obscure syntax in as special-case options, wasting more time. Specifically I'm thinking about Swift, where, if the last parameter in a function call is a closure, it's a "trailing" closure and you can just ignore the function signature and plop the whole closure right there AFTER the closing parenthesis. Why? https://www.hackingwithswift.com/sixty/6/5/trailing-closure-...
This is just one example, and yeah... you can get used to it. But in this example, the language has undermined PARENTHESES, a notation that is almost universally understood to enclose things. When something that basic is out the window, you're dealing with language designers who lack an appreciation for human communication.
> The functions are self contained in that their dependencies/inputs are the arguments provided or other pure functions and the outputs are entirely in the return type.
Is this just a fancy way of saying static functions?
There’s a difference between simplifying a concept and stating it plainly.
I use this analogy a lot. Code can be like a novel, a short story, or a poem. A short story has to get to the point pretty quickly. A poem has to be even more so, but it relies either on shared context or extensive unpacking to be understood. It’s beautiful but not functional.
And there are a bunch of us short story writers who just want to get to the fucking point with a little bit of artistic flair, surrounded by a bunch of loud novel and mystery writers arguing with the loudest poets over which is right when they are both wrong. And then there’s that asshole over there writing haikus all the fucking time and expecting the rest of us to be impressed. The poets are rightfully intimidated but nobody else wants to deal with his bullshit.
I consider code bad if it takes more than 5 seconds to read and understand the high-level goal of a function.
Doesn't matter how it looks. If it's not possible to understand what a function accomplishes within a reasonable amount of time (without requiring hours upon hours of development experience), it's simply bad.
There is a call-stack depth problem here that is specific to codebases, though. For someone familiar with the conventions, the key data abstractions (not just the data model but the conventions for how models are structured and relate), and the key code abstractions, a well-formed function is easy to understand. But someone relatively new to the codebase will need to take a bunch of time switching between levels to know what can be assumed about the state or control flow of the system in the context of when that function/subroutine is running. Better codebases avoid side effects, but even with good separation there, non-trivial changes require strong reasoning about where to make changes in the system to avoid introducing side effects, and not just passing extra state around all over the place.
So I'd take "good architecture" with OK-and-above readability over excellent readability with "poor architecture" any day, where architecture in this context means the broader code structure of the whole project.
Sounds like you are content to limit yourself to problems that do not contain more irreducible complexity or require more developer context than what fits within five seconds of comprehension.
That's a good rule for straightforward CRUD apps and single-purpose backend systems, but as a universal declaration, "it is simply bad" is an ex cathedra metaphysical claim from someone who has mistaken their home village for the entirety of the universe.
> I consider code bad if it takes more then 5 seconds to read and understand the high level goal of a function.
That's something that's possible only for fairly trivial logic, though. Real code needs to be built on an internal "language" reflecting its invariants and data model and that's not something you can see with a microscope.
IMHO obsessive attention to microscope qualities (endless style nitpicking in code review, demands to run clang-format or whatever the tool du jour is on all submissions, style guides that limit function length, etc...) hurts and doesn't help. Good code, as the grandparent points out, is a heuristic quality and not one well-defined by rules like yours.
Code that looks like it has a bug in it but doesn’t will draw the eye over, and over, and over again when fishing for how regressions or bugs got into the code. This is the real cost of code smells. At some point it’s cheaper for me to clean up your mess than to keep walking past it every day. But I’m going to hate you a little bit every time I do.
Consider reading kernel or driver code. These areas have a huge amount of prerequisite knowledge that - I argue - makes it OK to violate the “understand at a glance” rule of thumb.
For the codebase I work on, I made a rule that "functions do what the name says and nothing else". This way, if the function does too much, hopefully you feel dumb typing the name and realize you should break it up.
Does this apply to all domains and all 'kinds' of code?
I feel like there's a fundamental difference in the information density between code that, for example, defines some kind of data structure (introducing a new 'shape' of data into an application) versus code that implements a known algorithm that might appear short in line length but carries a lot of information and therefore complexity.
This is what xmldoc/jsdoc/etc are for. If it's not 100% obvious from the name, put a summary of the function's assumptions, side effects, and output, and possibly an example, in the comment-doc. If you do this right, the next programmer will never have to read your source at all (or even navigate to your file! They'll hover over a method call or find it in the dot-autocomplete and see a little tooltip with this documentation in it, and know all they need to know). It's an incredible thing when it works. It's a little bit more effort, but I don't accept the FUD around "comments become out of date immediately because the code will change" etc. - that should be part of code review.
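A sketch of the kind of comment-doc being described (the function and its shape are invented for illustration):

```javascript
/**
 * Returns the distinct authors of all books longer than `minPages`.
 * Performs a single pass over `books`; does not mutate its input.
 *
 * @param {Array<{author: string, pageCount: number}>} books
 * @param {number} minPages - exclusive lower bound on page count
 * @returns {Set<string>} distinct author names
 * @example
 *   authorsOfLongBooks(books, 1000);
 */
function authorsOfLongBooks(books, minPages) {
  const authors = new Set();
  for (const book of books) {
    if (book.pageCount > minPages) authors.add(book.author);
  }
  return authors;
}
```

An editor that understands JSDoc will surface all of that in the hover tooltip, so the reader never has to open the file.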
>This annoys a lot of computer programmers (and academics) who are inclined to the mathematical mindset and want quantitative answers instead.
I find many syntactical patterns that are considered elegant to be the opposite, and not as clear as mathematics, actually. For example, the ternary operator mentioned in the article, `return n % 2 === 0 ? 'Even' : 'Odd';`, feels very backwards to my human brain. It's better suited for the compiler to process the syntax tree than for a human. A human mathematician would do something like this:
Well of course if you have the freedom to write a mathematical expression you're going to be able to present it in a way that is clearer than if you have to type monospace characters into a text editor.
I'm not sure it's realistic to expect to be able to type a mathematical expression using ascii more clearly than you can write it by hand (or implement using special unicode characters).
I understand that you might find the mathematical notation clearer but I think it's presumptuous of you to speak on behalf of all humans, or even all human mathematicians. I'm a mathematics graduate and I find the conditional operator more readable in a program because it corresponds to what the program actually does (it checks the condition first); but I also recognize that the two notations have exactly the same information content and only differ superficially in syntax, making it entirely a matter of familiarity.
This is why code reviews are so critical: They help keep a consistent style while onboarding new team members, and they help a team keep its style (reasonably) consistent.
The article's good, but misses my most mentally-fatiguing issue when reading code: mutability.
It is such a gift to be able to "lock in" a variable's meaning exactly once while reading a given method, and to hold it constant while reasoning about the rest of the method.
Your understanding of the method should monotonically increase from 0% to 100%, without needing to mentally "restart" the method because you messed up what the loop body did to an accumulator on a particular iteration.
This is the real reason why GOTOs are harmful: I don't have a hard time moving my mind's instruction-pointer around a method; I have a hard time knowing the state of mutable variables when GOTOs are in play.
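For instance, with `const` each binding's meaning is locked in once and holds for the rest of the function (a minimal sketch with made-up data):

```javascript
// Each name is assigned exactly once; nothing below can silently
// change what `prices` or `doubled` means while you read on.
const prices = [10, 20, 30];
const doubled = prices.map(p => p * 2);
const total = doubled.reduce((sum, p) => sum + p, 0);
console.log(total); // 120
```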
Disagree. There's an abstract "information space" that the code is modeling, and you have to move around your mind's instruction pointer in that space. This can be helped or hindered by both mutable and immutable vars--it depends on how cleanly the code itself maps into that space. This can be a problem w/ both mutable and immutable vars. There's a slight tactical advantage to immutable vars b/c you don't have to worry about the value changing or it changing in a way that's misleading, but IME it's small and not worth adopting a "always use immutability" rule-of-thumb. Sometimes mutability makes it way easier to map into that "information space" cleanly.
> This is the real reason why GOTOs are harmful: I don't have a hard time moving my mind's instruction-pointer around a method; I have a hard time knowing the state of mutable variables when GOTOs are in play.
Well, total complexity is not only about moving the instruction pointer given a known starting point. Look at it from the callee’s pov instead of the call site. If someone can jump to a line, you can’t backtrack and see what happened before, because it could have come from anywhere. Ie you needed global program analysis, instead of local.
If mutability were the true source of goto complexity then if-statements and for loops have the same issue. While I agree mutability and state directly causes complexity, I think goto was in a completely different (and harmful) category.
While this is simple and all, the English words if/else don’t require the reader to know the ?: convention.
Depending on what background the reader may have, they could think of the set notation where it could mean "all the evens such that odd is true", which makes no sense. It's also very close to a key:value set notation. If/else leaves no doubts for the majority of readers. It's more inclusive, if you will.
I feel guard clauses/early returns end up shifting developer focus toward narrowing the function's operation, rather than an original happy path with some afterthought about other conditions it could handle.
IME, elses also end up leading to further nesting and evaluating edge cases or variables beyond the immediate scope of the happy path (carrying all that context!).
I personally prefer the former as you can visually see the return one level of indentation below function name. It shows a guaranteed result barring no early-exits. Something about having the return embedded lower just seems off to me.
I would add a blank line to push 'return "Odd";' from the if, and also add brackets around the if-body if the language allows.
There are situations where I allow else, they tend to have side effects, but usually I refactor until I get rid of it because it'll come out clearer than it was. Commonly something rather convoluted turns into a sequence of guards where execution can bail ordered based on importance or execution cost. It isolates the actual function/method logic from the exit conditions.
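A minimal sketch of that refactor (the validation logic is made up): something convoluted turns into a sequence of guards, ordered by importance, isolating the actual logic from the exit conditions.

```typescript
// Nested if/else version: the happy path is buried at maximum indentation.
function describeNested(n: number): string {
  if (Number.isFinite(n)) {
    if (n >= 0) {
      return `ok: ${n}`;
    } else {
      return "negative";
    }
  } else {
    return "not a number";
  }
}

// Guard-clause version: invalid states bail out early, and the happy
// path sits unindented at the end.
function describeGuarded(n: number): string {
  if (!Number.isFinite(n)) return "not a number";
  if (n < 0) return "negative";
  return `ok: ${n}`;
}
```

Both compute the same result; the guard version just separates "when do we bail" from "what do we do".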
The asymmetry is apparent if the code gets refactored to continuation/callback style or to mutation of a more complex data structure: the first method will fall through and execute the second instruction set. Return is a special operator in this sense in that it breaks control flow, and the ordinary control flow of the first method does not capture the exhaustiveness of the two cases.
In idiomatic Rust, return isn't used except for exceptional cases that break the control flow of the method, and the second example is more commonly seen without return statements at all. Idiomatic Python also typically early-exits at the beginning with a return on invalid parameters or state, with a tail-position return being the usual actual return value. Because of these conventional practices, breaking the exhaustive if-else control structure makes the indented return appear exceptional (like an invalidity). If you follow these conventions, then naturally the return statement begins to appear redundant except in the break-control-flow cases, and the choice of the Rust convention begins to make sense: in all languages return is a statement equivalent to break.
Nice example of how subjective this is. I immediately thought the first one without "else" is clearly the winner.
This is the problem with formatting rules. A codebase needs to have consistent style, even though that might mean nobody is fully happy with it.
I for example can not stand semicolons in JavaScript. It is just a visual clutter that is completely redundant, and yet some people really want it there.
If it's this short, the ternary operator would be the absolute best option IMHO.
If any of the clauses are much longer, the first option reads a lot better if it can be a guard clause that returns very quickly.
If neither options are short I'd argue they should be pushed away into scoped and named blocks (e.g. a function) and we're back to either a ternary operation or a guard like clause.
Maybe it's just me, but TypeScript makes code hard to read.
It's fine if the data model is kept somewhat "atomic" and devs are diligent about actually declaring and documenting types (on my own projects, I'm super diligent about this).
But once types start deriving from types using utility functions and then devs slack and fall back to type inference (because they skip an explicit type), it really starts to unravel because it's very hard to trace fields back to their origin in a _deep_ stack (like 4-5 levels of type indirection; some inferred, some explicit, some derived, some fields get aliased...).
type Dog = {
breed: string
size: "lg" | "md" | "sm"
// ...
}
type DogBreedAndSize = Pick<Dog, "breed" | "size">
function checkDogs(dogs: Dog[]) : DogBreedAndSize[] {
return dogs.map(d => /* ... */)
}
const checkedDoggos = checkDogs([])
Versus:
function checkDogs(dogs: Dog[]) {
// ...
}
Very subtle, but for large data models with deep call stacks, the latter is completely unusable and absolutely maddening.
I agree that functions should probably specify their output type, MOSTLY to enforce that all paths that return from that function must adhere to that type
I've seen plenty of regressions where someone added a new condition to a function and then returned a slightly different type than other branches did, and it broke things
However, I don't think there is much value in putting types on variable declarations
In your example,
`const checkedDoggos = checkDogs([])` is good. Just let checkedDoggos inherit the type from the function
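A sketch of the kind of regression described above, assuming a made-up `Status` type: with the explicit annotation, a new branch that returns a slightly different type fails to compile instead of breaking callers at runtime.

```typescript
type Status = "ok" | "error";

// With the explicit return type, adding a branch that returns a typo like
// "eror" (or a different shape entirely) is a compile-time error; without
// it, the inferred return type silently widens and the breakage surfaces
// somewhere downstream instead.
function classify(code: number): Status {
  if (code >= 400) return "error";
  return "ok";
}
```

Callers like `const result = classify(404)` can then safely rely on inference, since the contract is pinned at the declaration.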
I have a codebase I'm working on where the linter enforces
I want it on the other side (on the function return) so that it's consistently displayed in type hints and intellisense so I don't have to navigate the code backwards 3-4 layers to find the root type (do you see what I'm saying?)
^^^ That's where it's important to not skip the type def because then I can see the root type in the editor hints and I don't need to dig into the call stack (I know the end result is the same whether it's on the assignment side or the declaration side, but it feels like ensuring it's always on the declaration side is where the value is)
I'd prefer to have some type information over nothing if the choice were between TypeScript with some inferred return types, versus JavaScript where you're never really sure and constantly have to walk back up/down the stack and keep it in your mind.
I'd say on backend, my preference is statically something like C#. Statically typed but enough type flexibility to be interesting (tuples, anonymous types, inferred types, etc)
Smaller functions with fewer variables are generally easier to read
I hate how a lot of focus on "readability" is on micro-readability, which then tends to encourage highly fragmented code under the highly misguided assumption that micro-readability is more important than macro-readability. The dogma-culting around this then breeds plenty of programmers who can't see the forest for the trees and end up creating grossly inefficient code and/or have difficulty with debugging.
APL-family languages are at the other extreme, although I suspect the actual optimum is somewhere in the middle and highly dependent on the individual.
There's certainly a middle ground, especially when multiple files are involved. 3-4 go-to-definitions in something I'm not familiar with and I'm struggling; now that's a me problem, but I can't imagine most people are miles ahead of me.
.Net culture, especially with "clean architecture", is shocking for this: you go to modify a feature or troubleshoot and things are spread across 4 layers and 15 files, some of which are > 60% keywords.
I don't have an answer for where the cutoff is, but I'll generally take 1 longer function that's otherwise neat and follows the other recommendations outlined, that I can read sequentially, instead of scrolling up and down every 5 lines because it's so fragmented. The same can be said for types/classes too: that 4-value enum used only for this DTO does not need to be in another file!
What’s tragic is this is completely self-inflicted and you could argue is done against the kind of code structure that is more idiomatic to C#. Luckily, you don’t have to do it if you are not already working with this type of codebase.
This is an interesting article, but also rather unsatisfying. It very quickly jumps to conclusions and goes right back to opinion. I agree with several of those opinions, but opinion was explicitly not the point of the article.
> Prefer to not use language-specific operators or syntactic sugars, since additional constructs are a tax on the reader.
I don't think this follows from the metric. If a function contains three distinct operators, a language-specific operator that replaces all three of them in one go would reduce the "effort" of the function. It's highly scenario-specific.
> Chaining together map/reduce/filter and other functional programming constructs (lambdas, iterators, comprehensions) may be concise, but long/multiple chains hurt readability
I don't think this follows either. One effect of these constructs when used right is that they replace other operators and reduce the "volume". Again this can go both ways.
> ...case in point, these code snippets aren’t actually equivalent!
That's a very language-specific diagnosis, and arguably points at hard-to-read language design in JS. The snippet otherwise doesn't look like JS, but I'm not aware of another language for which this would apply. Indeed it is also commonly known as a "null-safe operator", because most languages don't have separate "null" and "undefined".
> variable shadowing is terrible
> long liveness durations force the reader to keep more possible variables and values in their head.
These can arguably be contradictory, and that is why I am a huge fan of variable shadowing in some contexts: By shadowing a variable you remove the previous instance from scope, rather than keeping both available.
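A small sketch of that scoping effect in TypeScript, which only permits shadowing across nested block scopes (the names here are hypothetical):

```typescript
// Shadowing removes the earlier binding from view: inside the inner block,
// only the parsed number is in scope, so the reader no longer has to keep
// the raw string version "alive" in their head.
function doubledPort(rawPort: string): number {
  const port: string = rawPort; // outer binding: the unparsed string
  {
    const port = Number(rawPort); // shadows the string from here on
    return port * 2;
  }
}
```

Whether this helps or hurts is exactly the subjective trade-off the comment describes: the liveness duration of the old binding ends, at the cost of reusing a name.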
There's a cool plugin for vscode called Highlight[1] that lets you set custom regexes to apply different colors to your code. I think a common use of this is to make //TODO comments yellow, but I use it to de-emphasize logs, which add a lot of visual noise because I put them EVERYWHERE. The library I maintain uses logs that look like:
this.logger?.info('Some logs here');
So I apply 0.4 opacity to it so that it kind of fades into the background. It's still visible, but at a glance, the actual business logic code pops out at you. This is my configuration for anyone who wants to modify it:
This is not at all implied by anything else in the article. This feels like a common "I'm unfamiliar with it so it's bad" gripe that the author just sneaked in. Once you become a little familiar with it, it's usually far easier to both read and write than any of the alternatives. I challenge anyone to come up with a more readable example of this:
By almost any complexity metric, including his, this code is going to beat the snot out of any other way of doing this. Please, learn just the basics of functional programming. You don't need to be able to explain what a Monad is (I barely can). But you should be familiar enough that you stop randomly badmouthing map and filter like you have some sort of anti-functional-programming Tourette's syndrome.

I agree the code you have there is very readable, but it's not really an example of what that sentence you quoted is referencing... However I didn't spell out exactly what I meant, so please allow me to clarify.
For me, roughly 5 calls in a chain is where things begin to become harder to read, which is the length of the example I used.
For the meaning of "multiple", I intended that to mean if there are nested chains or if the type being operated on changes, that can slow down the rate of reading for me.
Functional programming constructs can be very elegant, but it's possible to go overboard :)
I for example prefer a well chosen one-liner list comprehension in python over a loop with temporary variables and nested if statements most of the time. That is because usually people who use the list comprehension do not program it with side effects, so I know this block of code, once understood stands for itself.
The same is true for the builder style code. I just need to know what each step does and I know what comes out in the end. I even know that the object that was set up might become relevant later.
With the traditional imperative style that introduces intermediate variables I might infer that those are just temporary, but I can't be sure until I read on, keeping those variables in my head. Leaving me in the end with many more possible ways future code could play out. The intermediate variables have the benefit of clarifying steps, but you can have that with a builder pattern too if the interface is well-chosen (or if you add comments).
This is why in an imperative style variables that are never used again should be marked (e.g. one convention is an underscore prefix like _temporaryvalue; a language like Rust would even enforce this via the compiler). But guess what: to a person unfamiliar with that convention this increases mental complexity ("What is that weird name?"), while it factually should reduce it ("I don't have to keep that variable in my head as it won't matter in the future").
In the end many things boil down to familiarity. For example in electronics many people prefer to write 4.7 kΩ as 4k7 instead, as it prevents you from accidentally overlooking the decimal point and making an off-by-a-magnitude error. This was particularly important in the golden age of the photocopier, as you can imagine. However, show that to a beginner and they will wonder what it is supposed to mean. Familiarity is subjective, and every expert was once a beginner coming from a different world where different things are familiar.
Something being familiar to a beginner (or someone who learned a particular way of doing X) is valuable, but it is not necessarily an objective measure of how well suited that representation is for a particular task.
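The builder style discussed above can be sketched minimally like this (a hypothetical API, not any particular library): each step returns the builder itself, so no intermediate values escape into the enclosing scope.

```typescript
// Minimal builder sketch: every step returns `this`, so a chain reads as a
// sequence of named operations and leaves no temporaries behind.
class GreetingBuilder {
  private parts: string[] = [];

  add(word: string): this {
    this.parts.push(word);
    return this; // enables chaining
  }

  build(): string {
    return this.parts.join(" ");
  }
}

const greeting = new GreetingBuilder().add("Hello").add("world").build();
```

Once a reader trusts that the steps have no side effects outside the builder, understanding the chain is just understanding each step in order.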
seeinglogic's article made me think of a 3rd option:
1. Sorta long functional chain where the type changes partway through
2. Use temp variables
3. (New option) Use comments
(Here's funcA from seeinglogic's article, but I added 3 comments)
Compare to funcB which uses temp variables: For me the commented version is easier to read and audit and it also feels safer for some reason, but I'm not sure how subjective that is.

The "anti-functional Tourette's" comment was partly a response to how completely random and unjustified it seemed in that part of the article, and also that this feels like a very common gut reaction to functional programming from people who aren't really willing to give it a try. I'm not only arguing directly against you here, but that attitude at large.
Your funcA vs. funcB example doesn't strike me as "functional" at all. No functions are even passed as arguments. That "fluent" style of long chains has been around in OO languages for a while, independent of functional programming (e.g. see d3.js*, which is definitely not the oldest). Sure, breaking long "fluent" chains up with intermediate variables can sometimes help readability. I just don't really get how any of this is the fault of functional programming.
I think part of the reason funcB seems so much more readable is that neither function's name explains what it's trying to do, so you go from 0 useful names to 3. If the function was called "getNamesOfVisibleNeighbors" it'd already close the readability gap a lot. Of course if it were called that, it'd be more clear that it might be just trying to do too much at once.
I view the "fluent" style as essentially embedding a DSL inside the host language. How readable it is depends a lot on how clear the DSL itself is. Your examples benefit from additional explanation partly because the DSL just seems rather inscrutable and idiosyncratic. Is it really clear what ".data()" is supposed to do? Sure, you can learn it, but you're learning an idiosyncrasy of that one library, not an agreed-upon language. And why do we need ".nodes()" after ".connected()"? What else can be connected to a node in a graph other than other nodes? Why do you need to repeat the word "node" in a string inside "graph.nodes()"? Why does a function with the plural "nodes" get assigned to a singular variable? As an example of how confusing this DSL is, you've claimed to find "visibleNames", but it looks to me like you've actually found the names of visible neighborNodes. It's not the names that are not(.hidden), it's the nodes, right? Consider this:
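(The code example originally posted at this point appears to have been lost; based on the description in the next paragraph, it was presumably something like this rewrite against a plain graph representation, with hypothetical names:)

```typescript
type GraphNode = { name: string; isHidden: boolean; neighbors: GraphNode[] };

// Names of the visible neighbors of a node, operating on a real graph
// representation rather than CSS classes and HTML attributes.
function visibleNeighborNames(node: GraphNode): string[] {
  return node.neighbors
    .filter(n => !n.isHidden) // it's the *nodes* that are hidden...
    .map(n => n.name);        // ...and the names we extract from them
}
```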
Note how much clearer ".filter(node => !node.isHidden)" is than ".not('.hidden')", and ".map(node => node.name)" versus ".data('name')". It's much harder to get confused about whether it's the node or the name that's hidden, etc. Getting the DSL right is really hard, which only increases the benefit of using things like "map" and "filter", which everyone immediately understands and which have no extrinsic complexity at all.
You could argue that it's somehow "invalid" to change the DSL, but my point is that if you're using the wrong tool for the job to begin with, then any further discussion of readability is in some sense moot. If you're doing a lot of logic on graphs, you should be dealing with a graph representation, not CSS classes and HTML attributes. Then the long chains are not an issue at all, because they read like a DSL in the actual domain you're working in.
*Sidenote: I hate d3's standard style, for some of the same reasons you mention, but mainly because "fluent" chains should never be mutating their operand.
This isn't just about readability. Chaining or FP is structurally more sound. It is the more proper way to code from an architectural and structural-pattern perspective.
This is what a for loop would look like: FP: You have 5 steps. With FP all 5 steps are reusable; with procedural code they are not. Mind you, I know you're thinking about chaining. Chaining is equivalent to inlining multiple operations together. So for example in that case
By nature functional is modular, so such syntax can easily be extracted into modules, with each module given a name. The procedural code cannot do this. It is structurally unsound and tightly coupled. It's not about going overboard here. The FP simply needs to be formatted to be readable, but it is the MORE proper way to make your code modular, general, and decoupled.
In a procedural loop, you can assign an intermediate result to a variable. By giving it a name, you can forget the processing you have done so far and focus on the next steps.
SELECT DISTINCT authors.some_field FROM books JOIN authors ON books.author_id = authors.author_id WHERE books.pageCount > 1000
And if you wanted to grab the entire authors record (like the code does) you'd probably need some more complexity in there:
SELECT * FROM authors WHERE author_id IN ( SELECT DISTINCT authors.author_id FROM books JOIN authors ON books.author_id = authors.author_id WHERE books.pageCount > 1000 )
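For comparison, a chained in-memory analogue of that subquery version might look like this (the record shapes are made up): full author records for books over 1000 pages, deduplicated by id.

```typescript
type Author = { authorId: number; name: string };
type BookRecord = { title: string; pageCount: number; author: Author };

// In-memory analogue of the SQL above: dedupe by authorId via a Map so
// we return whole author records, not just a field.
function longBookAuthors(books: BookRecord[]): Author[] {
  const byId = new Map(
    books
      .filter(b => b.pageCount > 1000)
      .map(b => [b.author.authorId, b.author] as const),
  );
  return [...byId.values()];
}
```

The Map keyed by id stands in for the `DISTINCT` plus the outer `IN (...)` lookup.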
To know the return type of the chain, I have to read through it and get to the end of each line.
A longBooks array, and map(longBooks, ‘author’) wouldn’t be much longer, but would involve more distinct and meaningful phrases.
I used to love doing chains! I used lodash all the time for things like this. It’s fun to write code this way. But now I see that it’s just a one-liner with line breaks.
Specifically around type-safety, that is knowing that the chained type is what you expect and communicating that expectation to the person who is reading the code without them needing to know the wider context of both the chained-API nor the function the chain resides in. In the context of this article, that means more complexity, and therefore less readability.
I feel this is important because I have worked on many legacy code bases where bugs were found where chains were not behaving as expected, normally after attrition in some other part of the code base, and then you have to become a detective to work out the original intent.
For readability chains are bad, because they can lie about their intent, especially if there’s various semantics that can be swapped. But, like any industry or code base, if their use is consistent, and the api mature/stable, they can be powerful and fast, if.
> But you should be familiar enough that you stop randomly badmouthing map and filter like you have some sort of anti-functional-programming Tourette's syndrome.
I've been moderated for saying much tamer, FYI.
In my book, keeping simple things simple and the not simple things not simple beats simplicity everywhere. This is actually what I consider a big drawback of functional style: often the not-simple parts are way too condensed, almost indistinguishable from trivialities. But in the loop scenario it's often the reverse.
My happy place, when writing, would be an environment that has enough AST-level understanding to transform between both styles.
(other advantages of functional style: skills and habits transfer to both async and parallel, the imperative loop: not so much)
Three things typically happen:
1. People who like these chains really like them. And I've seen multiple "one-liner expressions" that were composed of several statements ANDed or ORed together, needing one or two line breaks.
2. When it breaks (and it does in the real world), debugging is a mess. At least the last time I had to deal with it there was no good way to put breakpoints in there. Maybe things have changed recently, but typically one had to rewrite it in classic style and then debug it.
3. It trips up a lot of otherwise good programmers who haven't seen it before.
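One common compromise when debugging: temporarily break the chain into named intermediates, which gives the debugger concrete lines to set breakpoints on (the data shapes here are made up).

```typescript
type Book = { title: string; pageCount: number; author: string };

// Same logic as a one-line chain, but each stage gets its own statement,
// so a debugger can stop between stages and inspect the intermediate value.
function distinctAuthorsDebuggable(books: Book[]): string[] {
  const longBooks = books.filter(b => b.pageCount > 1000); // breakpoint here
  const authors = longBooks.map(b => b.author);            // ...or here
  return [...new Set(authors)];
}
```

The two forms are mechanically interchangeable, which is why some editors offer "extract variable" refactorings to flip between them.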
SELECT DISTINCT authors FROM books WHERE page_count > 1000;
Three dots is just a random Tuesday.
this could be something more like
distinct
I think FP looks pretty terrible in JS, Rust, Python, etc. When you read it out loud it's also natural, clear, and in the right order: "for each book, if the book has more than 1000 pages, add it to the set."
That isn't natural to anyone who is not intimately familiar with procedural programming. The language-natural phrasing would be "which of these books have more than a thousand pages? Can you give me their authors?" -- which maps much closer to the parent's linq query than to your code.
> ... no function call overhead.
This code has more function calls. O(n) vs 3 for the original
I would argue that's a downside: you have to pick the appropriate data structure beforehand here, whereas .distinct() picks the data structure for you. If, in the future, someone comes up with a better way of producing a distinct set of things, the functional code gets that for free, but this code is locked into a particular way of doing things. Also, .distinct() tells you explicitly what you want, whereas the intention of set() is not as immediately obvious.
> There are no intermediate results to think about
I could argue that there aren't really intermediate results in my example either, depending on how you think about it. Are there intermediate results in the SQL query "SELECT DISTINCT Author FROM Books WHERE Books.PageCount > 1000"? Because that's very similar to how I mentally model the functional chain.
There are also intermediate results, or at least intermediate state, in your code: at any point in the loop, your set is in an intermediate state. It's not a big deal there either though: I'd argue you don't really think about that state either.
> and no function call overhead
That's entirely a language-specific thing, and volatile: new versions of a language may change how any of this stuff is implemented under the hood. It could be that "for ... in" happens to be a relatively expensive construct in some languages. You're probably right that the imperative code is slightly faster in most languages today, and if it has been shown via performance analysis that this particular code is a bottleneck, it makes sense to sacrifice readability in favor of performance. But it is a sacrifice in readability, and the current debate is over which is more readable in the first place.
> a single pass over books
Another detail that may or may not be true, and probably doesn't matter. The overhead of different forms of loops is just not what's determining the performance of almost any modern application. Also, my example could be a single pass if those methods were implemented in a lazy, "query builder" form instead of an immediately-evaluated form.
In fact, whether this query should be immediately evaluated is not necessarily this function's decision. It's nice to be able to write code that doesn't care about that. My example works the same for a wide variety of things that "books" could be, and the strategy to get the answer can be different depending on what it is. It's possible the result of this code is exactly the SQL I mentioned earlier, rather than an in-memory set. There are lots of benefits to saying what you want, instead of specifying exactly how you want it.
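The lazy, "query builder" form mentioned above can be sketched with generators, which make the whole pipeline a single pass regardless of how many stages are chained (a sketch, not any particular library's API):

```typescript
// Lazy combinators: nothing executes until the result is consumed.
function* filterLazy<T>(xs: Iterable<T>, keep: (x: T) => boolean): Generator<T> {
  for (const x of xs) if (keep(x)) yield x;
}

function* mapLazy<T, U>(xs: Iterable<T>, f: (x: T) => U): Generator<U> {
  for (const x of xs) yield f(x);
}

// Each element flows through both stages once, so chaining them is still
// a single pass over the input.
const evensDoubled = (nums: number[]) =>
  [...mapLazy(filterLazy(nums, n => n % 2 === 0), n => n * 2)];
```

Because the stages are only strategies until something iterates them, the same pipeline could in principle be compiled to a different execution plan entirely, which is the point being made about not deciding evaluation at the call site.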
For example, use all the prime numbers as an expression in your chain.
> [100960734,100960764,100960792,100960800,100960812]
> 3 MiB total memory in use (0 MB lost due to fragmentation)
Which require the author to actually have an idea how big the numbers are, but that is very often the case regardless of how you write your code.
This functional style is what I call write-only code, because the only person who can understand it is the one who wrote it. Pandas loves this kind of method chaining, and it's one of the chief reasons pandas code is hard to read.
It actively hurts maintainability, please stop using it.
I love Dostoyevsky and Wodehouse; both wrote very well, but also very differently. While I don't think coding is quite that open a playing field, I have worked on good code bases that feel very different qualitatively. It often takes me a while to "get" the style of a code base, just as a new author may take a while for me to get.
I follow the pure functional programming paradigm which I think lends itself to this more narrative style. The functions are self contained in that their dependencies/inputs are the arguments provided or other pure functions, and the outputs are entirely in the return type.
This makes it incredibly easy to walk a reader through the complexity step-by-step (whereas other paradigms might have other complexities, like hidden state, for example). So, ironically, the most mathematically precise programming paradigm is also the best for the more narrative style (IMHO of course!)
When I decided on the final design and basically barfed all the code out in a matter of days, I walked this guy (a non-programmer) through the code. He then wrote my manager a letter declaring it to be the "most beautiful code he had ever seen." I still have the Post-It she left in my cube telling me that.
I have little tolerance for untidy code, and also overly-clever syntax that wastes the reader's time trying to unravel it.
And now we have languages building more inconsistent and obscure syntax in as special-case options, wasting more time. Specifically I'm thinking about Swift where, if the last parameter in a function call is a closure, it's a "trailing" closure and you can just ignore the function signature and plop the whole closure right there AFTER the closing parenthesis. Why? https://www.hackingwithswift.com/sixty/6/5/trailing-closure-...
This is just one example, and yeah... you can get used to it. But in this example, the language has undermined PARENTHESES, a notation that is almost universally understood to enclose things. When something that basic is out the window, you're dealing with language designers who lack an appreciation for human communication.
Is this just a fancy way of saying static functions?
I use this analogy a lot. Code can be like a novel, a short story, or a poem. A short story has to get to the point pretty quickly. A poem has to be even more so, but it relies either on shared context or extensive unpacking to be understood. It’s beautiful but not functional.
And there are a bunch of us short story writers who just want to get to the fucking point with a little bit of artistic flair, surrounded by a bunch of loud novel and mystery writers arguing with the loudest poets over which is right when they are both wrong. And then there’s that asshole over there writing haikus all the fucking time and expecting the rest of us to be impressed. The poets are rightfully intimidated but nobody else wants to deal with his bullshit.
Doesn't matter how it looks. If it's not possible to understand what a function accomplishes within a reasonable amount of time (without requiring hours upon hours of development experience), it's simply bad.
So, I'd take "good architecture" with ok and above readability, over excellent readability but "poor architecture" any day. Where architecture in this context means the broader code structure of the whole project.
That's a good rule for straightforward CRUD apps and single-purpose backend systems, but as a universal declaration, "it is simply bad" is an ex cathedra metaphysical claim from someone who has mistaken their home village for the entirety of the universe.
That's something that's possible only for fairly trivial logic, though. Real code needs to be built on an internal "language" reflecting its invariants and data model and that's not something you can see with a microscope.
IMHO obsessive attention to microscope qualities (endless style nitpicking in code review, demands to run clang-format or whatever the tool du jour is on all submissions, style guides that limit function length, etc...) hurts and doesn't help. Good code, as the grandparent points out, is a heuristic quality and not one well-defined by rules like yours.
Consider reading kernel or driver code. These areas have a huge amount of prerequisite knowledge that - I argue - makes it OK to violate the “understand at a glance” rule of thumb.
I feel like there's a fundamental difference in the information density between code that, for example, defines some kind of data structure (introducing a new 'shape' of data into an application) versus code that implements a known algorithm that might appear short in line length but carries a lot of information and therefore complexity.
That's the mindset that the author is trying to counter.
https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...
(Also, see my comment about .editorconfig: https://news.ycombinator.com/item?id=43333011. It helps reduce discussions about style minutia in pull requests.)
It is such a gift to be able to "lock in" a variable's meaning exactly once while reading a given method, and to hold it constant while reasoning about the rest of the method.
Your understanding of the method should monotonically increase from 0% to 100%, without needing to mentally "restart" the method because you messed up what the loop body did to an accumulator on a particular iteration.
This is the real reason why GOTOs are harmful: I don't have a hard time moving my mind's instruction-pointer around a method; I have a hard time knowing the state of mutable variables when GOTOs are in play.
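The "lock in once" idea can be sketched like so (a hypothetical example, not from the article):

```typescript
// Mutable accumulator: to know `total` at any point, the reader has to
// replay every iteration so far in their head.
function sumOfSquaresLoop(xs: number[]): number {
  let total = 0;
  for (const x of xs) {
    total += x * x;
  }
  return total;
}

// Single-assignment style: each name is bound exactly once and held
// constant while reasoning about the rest of the function.
function sumOfSquares(xs: number[]): number {
  const squares = xs.map((x) => x * x);
  const total = squares.reduce((acc, s) => acc + s, 0);
  return total;
}
```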
Well, total complexity is not only about moving the instruction pointer from a known starting point. Look at it from the callee's POV instead of the call site's: if someone can jump to a line, you can't backtrack and see what happened before, because control could have come from anywhere. I.e., you need global program analysis instead of local.
If mutability were the true source of goto complexity then if-statements and for loops have the same issue. While I agree mutability and state directly causes complexity, I think goto was in a completely different (and harmful) category.
IME else’s also end up leading to further nesting and evaluating edge cases or variables beyond the immediate scope of happy path (carrying all that context!).
There are situations where I allow else, they tend to have side effects, but usually I refactor until I get rid of it because it'll come out clearer than it was. Commonly something rather convoluted turns into a sequence of guards where execution can bail ordered based on importance or execution cost. It isolates the actual function/method logic from the exit conditions.
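A hedged sketch of that refactor, with an invented `Order` type:

```typescript
interface Order {
  id: string;
  paid: boolean;
  items: string[];
}

// Nested else-branches: each level carries extra context for the reader.
function shipNested(order: Order | null): string {
  if (order !== null) {
    if (order.paid) {
      if (order.items.length > 0) {
        return `shipping ${order.id}`;
      } else {
        return "nothing to ship";
      }
    } else {
      return "unpaid";
    }
  } else {
    return "no order";
  }
}

// Guard sequence: bail early, ordered by importance, then the happy path
// sits flat at the bottom, isolated from the exit conditions.
function shipGuarded(order: Order | null): string {
  if (order === null) return "no order";
  if (!order.paid) return "unpaid";
  if (order.items.length === 0) return "nothing to ship";
  return `shipping ${order.id}`;
}
```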
In idiomatic Rust, `return` isn't used except for exceptional cases that break the control flow of the method, and the second example is more commonly seen without `return` statements at all. Idiomatic Python also typically exits early at the beginning with a `return` on invalid parameters or state, with a tail-position `return` as the usual actual return value. Because of these conventional practices, breaking the exhaustive if-else control structure makes the indented `return` appear exceptional (like an invalidity). If you follow these conventions, then naturally the `return` statement begins to appear redundant except in the break-control-flow cases, and the choice of the Rust convention begins to make sense: in all languages, `return` is a statement equivalent to `break`.
This is the problem with formatting rules. A codebase needs to have consistent style, even though that might mean nobody is fully happy with it.
I, for example, cannot stand semicolons in JavaScript. They are just visual clutter, completely redundant, and yet some people really want them there.
If any of the clauses is much longer, the first option reads a lot better if it can be a guard clause that returns very quickly.
If neither option is short, I'd argue they should be pushed into scoped and named blocks (e.g. a function), and we're back to either a ternary operation or a guard-like clause.
For two (or more) equally valid branches, I prefer keeping the same nesting.
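For instance (a made-up example of two equally valid branches):

```typescript
// Neither branch is an error case or a guard; they are peers,
// so symmetric nesting reflects that symmetry.
function formatTemperature(celsius: number, useFahrenheit: boolean): string {
  if (useFahrenheit) {
    return `${celsius * 9 / 5 + 32}°F`;
  } else {
    return `${celsius}°C`;
  }
}
```

Rewriting this as a guard would arbitrarily promote one unit to "the happy path", which misrepresents the logic.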
It's fine if the data model is kept somewhat "atomic" and devs are diligent about actually declaring and documenting types (on my own projects, I'm super diligent about this).
But once types start deriving from types using utility functions and then devs slack and fall back to type inference (because they skip an explicit type), it really starts to unravel because it's very hard to trace fields back to their origin in a _deep_ stack (like 4-5 levels of type indirection; some inferred, some explicit, some derived, some fields get aliased...).
Very subtle, but for large data models with deep call stacks, the latter is completely unusable and absolutely maddening. I've seen plenty of regressions where someone added a new condition to a function and then returned a slightly different type than the other branches did, and it broke things.
However, I don't think there is much value in putting types on variable declarations.
In your example,
`const checkedDoggos = checkDogs([])` is good. Just let `checkedDoggos` inherit its type from the function's return type.
I have a codebase I'm working on where the linter enforces
`const checkedDoggos: DogBreedAndSize[] = checkDogs([])`
It is very silly and doesn't add much value, imo.
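A sketch of the two styles side by side; `DogBreedAndSize`'s fields and `checkDogs`'s body are my assumptions, since the thread only shows the call sites:

```typescript
interface DogBreedAndSize {
  breed: string;
  sizeCm: number;
}

// Hypothetical stand-in for the function the linter rule applies to.
function checkDogs(dogs: DogBreedAndSize[]): DogBreedAndSize[] {
  return dogs.filter((d) => d.sizeCm > 0);
}

// Inferred: the type flows from checkDogs's signature and tracks it
// automatically if that signature ever changes.
const checkedDoggos = checkDogs([{ breed: "corgi", sizeCm: 30 }]);

// Annotated: restates what the signature already says; this is the
// form the lint rule enforces.
const checkedDoggosExplicit: DogBreedAndSize[] = checkDogs([]);
```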
I hate how a lot of focus on "readability" is on micro-readability, which then tends to encourage highly fragmented code under the highly misguided assumption that micro-readability is more important than macro-readability. The dogma-culting around this then breeds plenty of programmers who can't see the forest for the trees and end up creating grossly inefficient code and/or have difficulty with debugging.
APL-family languages are at the other extreme, although I suspect the actual optimum is somewhere in the middle and highly dependent on the individual.
.NET culture, especially with "clean architecture", is shocking for this: you go to modify a feature or troubleshoot, and things are spread across 4 layers and 15 files, some of which are more than 60% keywords.
I don’t have an answer for where the cutoff is, but I'll generally take one longer function that's otherwise neat and follows the other recommendations outlined, one I can read sequentially instead of scrolling up and down every 5 lines because the code is so fragmented. The same can be said for types/classes: that 4-value enum used only for this DTO does not need to be in another file!
> Prefer to not use language-specific operators or syntactic sugars, since additional constructs are a tax on the reader.
I don't think this follows from the metric. If a function contains three distinct operators, a language-specific operator that replaces all three of them in one go would reduce the "effort" of the function. It's highly scenario-specific.
> Chaining together map/reduce/filter and other functional programming constructs (lambdas, iterators, comprehensions) may be concise, but long/multiple chains hurt readability
I don't think this follows either. One effect of these constructs when used right is that they replace other operators and reduce the "volume". Again this can go both ways.
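For example (hypothetical code, not from the article):

```typescript
// Imperative version: a loop, a conditional, a push, and two mutable
// bindings for the reader to track.
function activeNamesLoop(users: { name: string; active: boolean }[]): string[] {
  const result: string[] = [];
  for (const user of users) {
    if (user.active) {
      result.push(user.name.toUpperCase());
    }
  }
  return result;
}

// Chained version: the same logic as two data-flow steps, no mutation.
function activeNames(users: { name: string; active: boolean }[]): string[] {
  return users.filter((u) => u.active).map((u) => u.name.toUpperCase());
}
```

Whether the chain reads better is exactly the subjective-familiarity question the thread is circling.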
> ...case in point, these code snippets aren’t actually equivalent!
That's a very language-specific diagnosis, and arguably points at hard-to-read language design in JS. The snippet otherwise doesn't look like JS, but I'm not aware of another language for which this would apply. Indeed it is also commonly known as a "null-safe operator", because most languages don't have separate "null" and "undefined".
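The JS/TS non-equivalence being referenced can be demonstrated like this (my example):

```typescript
type MaybeUser = { name?: string } | null | undefined;

function nameViaAnd(u: MaybeUser) {
  // `&&` yields its left operand unchanged when it is falsy...
  return u && u.name;
}

function nameViaChain(u: MaybeUser) {
  // ...while `?.` always yields `undefined` for a null/undefined receiver.
  return u?.name;
}
```

So replacing a `&&` chain with `?.` can change a returned `null` into `undefined`, which matters to any caller distinguishing the two.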
> variable shadowing is terrible
> long liveness durations force the reader to keep more possible variables and values in their head.
These can arguably be contradictory, and that is why I am a huge fan of variable shadowing in some contexts: By shadowing a variable you remove the previous instance from scope, rather than keeping both available.
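A minimal TypeScript sketch (TS doesn't allow Rust-style same-scope rebinding, so the shadow here comes from a nested callback scope):

```typescript
const data = [1, 2, 3];

// The callback parameter shadows the outer `data`: inside the arrow
// function only the current element is in scope, so there is one
// fewer live binding for the reader to keep track of.
const doubled = data.map((data) => data * 2);
```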