What makes code hard to read: Visual patterns of complexity (2023)

> Chaining together map/reduce/filter and other functional programming constructs (lambdas, iterators, comprehensions) may be concise, but long/multiple chains hurt readability

This is not at all implied by anything else in the article. This feels like a common "I'm unfamiliar with it so it's bad" gripe that the author just sneaked in. Once you become a little familiar with it, it's usually far easier to both read and write than any of the alternatives. I challenge anyone to come up with a more readable example of this:

    var authorsOfLongBooks = books
        .filter(book => book.pageCount > 1000)
        .map(book => book.author)
        .distinct()

By almost any complexity metric, including his, this code is going to beat the snot out of any other way of doing this. Please, learn just the basics of functional programming. You don't need to be able to explain what a Monad is (I barely can). But you should be familiar enough that you stop randomly badmouthing map and filter like you have some sort of anti-functional-programming Tourette's syndrome.

seeinglogic · a year ago

This comment seems unnecessarily mean-spirited... perhaps I just feel that way because I'm the person on the other end of it!

I agree the code you have there is very readable, but it's not really an example of what that sentence you quoted is referencing... However I didn't spell out exactly what I meant, so please allow me to clarify.

For me, roughly 5 calls in a chain is where things begin to become harder to read, which is the length of the example I used.

For the meaning of "multiple", I intended that to mean if there are nested chains or if the type being operated on changes, that can slow down the rate of reading for me.

Functional programming constructs can be very elegant, but it's possible to go overboard :)

atoav · a year ago

To me the functional style is much more easy to parse as well. Maybe the lesson is that familiarity can be highly subjective.

I for example prefer a well chosen one-liner list comprehension in python over a loop with temporary variables and nested if statements most of the time. That is because usually people who use the list comprehension do not program it with side effects, so I know this block of code, once understood stands for itself.

The same is true for the builder style code. I just need to know what each step does and I know what comes out in the end. I even know that the object that was set up might become relevant later.

With the traditional imperative style that introduces intermediate variables I might infer that those are just temporary, but I can't be sure until I read on, keeping those variables in my head. Leaving me in the end with many more possible ways future code could play out. The intermediate variables have the benefit of clarifying steps, but you can have that with a builder pattern too if the interface is well-chosen (or if you add comments).

This is why in an imperative style variables that are never used should be marked (e.g. one convention is a underscore-prefix like _temporaryvalue — a language like Rust would even enforce this via compiler). But guess what: to a person unfamilar with that convention this increases mental complexity ("What is that weird name?"), while it factually should reduce it ("I don't have to keep that variable in the brain head as it won't matter in the future").

In the end many things boil down to familiarity. For example in electronics many people prefer to write a 4.7 kΩ as 4k7 instead, as it prevents you from accidentally overlooking the decimal point and making an accidental off-by-a-magnitude-error. This was particularly important in the golden age of the photocopier as you can imagine. However show that to a beginner and they will wonder what that is supposed to mean. Familiarity is subjective and every expert was once a beginner coming from a different world, where different things are familiar.

Something being familiar to a beginner (or someone who learned a particular way of doing X) is valuable, but it is not necessarily an objective measure of how well suited that representation is for a particular task.

p2edwards · a year ago

Agreed —

seeinglogic's article made me think of a 3rd option:

1. Sorta long functional chain where the type changes partway through 2. Use temp variables 3. (New option) Use comments

(Here's funcA from seeinglogic's article, but I added 3 comments)

    function funcC(graph) {
      return 
        // target node
        graph.nodes(`node[name = ${name}]`)
          // neighbor nodes
          .connected()
          .nodes()
          // visible names
          .not('.hidden')
          .data('name');
     }

Compare to funcB which uses temp variables:

    function funcB(graph) {
      const targetNode = graph.nodes(`node[name = ${name}]`)
      const neighborNodes = targetNode.connected().nodes();
      const visibleNames = neighborNodes.not('.hidden').data('name')

      return visibleNames;
    }

For me the commented version is easier to read and audit and it also feels safer for some reason, but I'm not how subjective that is.

feoren · a year ago

The dig on chains of map/reduce/filter was listed as a "Halstead Complexity Takeaway", and seemed to come out of the blue, unjustified by any of the points made about Halstead complexity. In fact in your later funcA vs. funcB example, funcB would seem to have higher Halstead complexity due to its additional variables (depending on whether they count as additional "operands" or not). In general, long chains of functions seem like they'd have lower Halstead complexity.

The "anti-functional Tourette's" comment was partly a response to how completely random and unjustified it seemed in that part of the article, and also that this feels like a very common gut reaction to functional programming from people who aren't really willing to give it a try. I'm not only arguing directly against you here, but that attitude at large.

Your funcA vs. funcB example doesn't strike me as "functional" at all. No functions are even passed as arguments. That "fluent" style of long chains has been around in OO languages for a while, independent of functional programming (e.g. see d3.js*, which is definitely not the oldest). Sure, breaking long "fluent" chains up with intermediate variables can sometimes help readability. I just don't really get how any of this is the fault of functional programming.

I think part of the reason funcB seems so much more readable is that neither function's name explains what it's trying to do, so you go from 0 useful names to 3. If the function was called "getNamesOfVisibleNeighbors" it'd already close the readability gap a lot. Of course if it were called that, it'd be more clear that it might be just trying to do too much at once.

I view the "fluent" style as essentially embedding a DSL inside the host language. How readable it is depends a lot on how clear the DSL itself is. Your examples benefit from additional explanation partly because the DSL just seems rather inscrutable and idiosyncratic. Is it really clear what ".data()" is supposed to do? Sure, you can learn it, but you're learning an idiosyncrasy of that one library, not an agreed-upon language. And why do we need ".nodes()" after ".connected()"? What else can be connected to a node in a graph other than other nodes? Why do you need to repeat the word "node" in a string inside "graph.nodes()"? Why does a function with the plural "nodes" get assigned to a singular variable? As an example of how confusing this DSL is, you've claimed to find "visibleNames", but it looks to me like you've actually found the names of visible neighborNodes. It's not the names that are not(.hidden), it's the nodes, right? Consider this:

    function getVisibleNeighborNames(graph) {
        return graph
            .nodeByName(name)
            .connectedNodes()
            .filter(node => !node.isHidden)
            .map(node => node.name)
    }

Note how much clearer ".filter(node => !node.isHidden)" is than ".not('.hidden')", and ".map(node => node.name)" versus ".data('name')". It's much harder to get confused about whether it's the node or the name that's hidden, etc.

Getting the DSL right is really hard, which only increases the benefit of using things like "map" and "filter" which everyone immediately understands, and which have no extrinsic complexity at all.

You could argue that it's somehow "invalid" to change the DSL, but my point is that if you're using the wrong tool for the job to begin with, then any further discussion of readability is in some sense moot. If you're doing a lot of logic on graphs, you should be dealing with a graph representation, not CSS classes and HTML attributes. Then the long chains are not an issue at all, because they read like a DSL in the actual domain you're working in.

*Sidenote: I hate d3's standard style, for some of the same reasons you mention, but mainly because "fluent" chains should never be mutating their operand.

ninetyninenine · a year ago

>For me, roughly 5 calls in a chain is where things begin to become harder to read, which is the length of the example I used.

This isn't just about readability. Chaining or FP is structurally more sound. It is the more proper way to code from a architectural and structural pattern perspective.

     given an array of numbers

   1. I want to add 5 to all numbers
   2. I want to convert to string
   3. I want to concat hello
   4. I want to create a reduced comma seperated string
   5. I want to capitalize all letters in the string.

This is what a for loop would look like:

   // assume x is the array
   acc = ""

   for(var i = 0, i < x.length; x++) {
       value = x[i] + 5
       value += 5
       stringValue = str(value).concat(hello)
       acc += stringValue + ","
   }

   for (var i = 0, i < acc.length; i++) {
       acc[i] = capitalLetter(acc[i])
   }

FP:

    addFive(x) = [i + 5 for i in x]
    toString(x) = [str(i) for i in x]
    concatHello = [i + "hello" for i in x]
    reduceStrings(x) = reduce((i, acc) = acc + "," + i, x)
    capitalize(x) = ([capitalLetter(i) for i in x]).toString()

You have 5 steps. With FP all 5 steps are reuseable. With Procedural it is not.

Mind you that I know you're thinking about chaining. Chaining is eqivalent to inlining multiple operations together. So for example in that case

     x.map(...).map(...).map(...).reduce(...).map(...)

     //can be made into
     addFive(x) = x.map(...)
     toString(x)= x.map(...)
     ...

By nature functional is modular so such syntax can easily be extracted into modules with each module given a name. The procedural code cannot do this. It is structurally unsound and tightly coupled.

It's not about going overboard here. The FP simply needs to be formatted to be readable, but it is the MORE proper way to code to make your code modular general and decoupled.

jltsiren · a year ago

Your example is a conceptually simple filter on a single list of items. But once the chain grows too long, the conditions become too complex, and there are too many lists/variables involved, it becomes impossible understand everything at once.

In a procedural loop, you can assign an intermediate result to a variable. By giving it a name, you can forget the processing you have done so far and focus on the next steps.

stouset · a year ago

You don't ever need to "understand everything at once". You can read each stanza linearly. The for loop style is the approach where everything often needs to be understood all at once since the logic is interspersed throughout the entire body.

reubenmorais · a year ago

In a practical example you'd create a named intermediate type which becomes a new base for reasoning. Once you convinced yourself that the first part of the chain responsible for creating that type (or a collection of it) is correct, you can forget it and free up working memory to move on to the next part. The pure nature of the steps also makes them trivially testable as you can just call them individually with easy to construct values.

kccqzy · a year ago

If you assign an intermediate result to a variable in a procedural loop, you can also assign intermediate results of parts of this chain to variables.

titzer · a year ago

SELECT DISTINCT author FROM books WHERE pageCount > 1000;

101011 · a year ago

In fairness, if this was in a relational data store, the same code as above would probably look more like...

SELECT DISTINCT authors.some_field FROM books JOIN authors ON books.author_id = authors.author_id WHERE books.pageCount > 1000

And if you wanted to grab the entire authors record (like the code does) you'd probably need some more complexity in there:

SELECT * FROM authors WHERE author_id IN ( SELECT DISTINCT authors.author_id FROM books JOIN authors ON books.author_id = authors.author_id WHERE books.pageCount > 1000 )

YesBox · a year ago

Scrolled to find the SQL. Such an elegant, powerful language. Really happy I chose SQLite for my game/project.

beryilma · a year ago

This is 5 times more readable than FP example above for the same computation. The FP example uses variable book(s) five times, where using it once was sufficient for SQL. Perhaps FP languages could have learned something from SQL...

odyssey7 · a year ago

Notably this example is declarative, the original is functional, and neither is imperative.

__mharrison__ · a year ago

Folks don't seem to have a problem when SQL does it. Only when code like Pandas does it...

RedNifre · a year ago

While I think your example is fine, I think the complaint was more about very long chains. Personally, I like to break them up to give intermediate results names, kinda like using variable names as comments:

  var longBooks = books.filter(book => book.pageCount > 1000)

  var authorsOfLongBooks = longBooks.map(book => book.author).distinct()

matejn · a year ago

I like the SQL solutions people posted. But what about this one in Prolog?

  ?- setof(Author, Book^Pages^(book_author(Book, Author), book_pages(Book, Pages), Pages > 1000), Authors).

Depending on the structure of the Prolog database, it could be shorter:

  ?- setof(Author, Pages^(book(_, Author, Pages), Pages > 1000), Authors).

dsego · a year ago

That's not a long chain. It doesn't even have a reduce, try nesting a few reducers and see how you like it.

aaronbrethorst · a year ago

What is "long"?

MathMonkeyMan · a year ago

> I challenge anyone [...]

    select distinct author from book where pageCount > 1000;

Deleted Comment

elliottkember · a year ago

Good example actually. You started with a books array, and changed the type to authors half-way through.

To know the return type of the chain, I have to read through it and get to the end of each line.

A longBooks array, and map(longBooks, ‘author’) wouldn’t be much longer, but would involve more distinct and meaningful phrases.

I used to love doing chains! I used lodash all the time for things like this. It’s fun to write code this way. But now I see that it’s just a one-liner with line breaks.

frankharrison · a year ago

One core reason chaining can be bad is robustness; another longevity/maintenance.

Specifically around type-safety, that is knowing that the chained type is what you expect and communicating that expectation to the person who is reading the code without them needing to know the wider context of both the chained-API nor the function the chain resides in. In the context of this article, that means more complexity, and therefore less readability.

I feel this is important because I have worked on many legacy code bases where bugs were found where chains were not behaving as expected, normally after attrition in some other part of the code base, and then you have to become a detective to work out the original intent.

For readability chains are bad, because they can lie about their intent, especially if there’s various semantics that can be swapped. But, like any industry or code base, if their use is consistent, and the api mature/stable, they can be powerful and fast, if.

throwaway894345 · a year ago

FWIW, I'm plenty familiar with functional programming and iterator chains, and I still think for loops often beat them--not only from a "visual noise" perspective, but because complex iterator chains are harder to read than equivalent for loops (particularly when you have to deal with errors-as-values and short circuiting or other patterns) and for simple tasks iterator chains might be marginally simpler but the absolute complexity of the task is so low that a for loop is fine.

> But you should be familiar enough that you stop randomly badmouthing map and filter like you have some sort of anti-functional-programming Tourette's syndrome.

I've been moderated for saying much tamer, FYI.

usrusr · a year ago

Fully agree on the easier to read part. But despite all "coffee is read more often than written" arguments, I see a lot of merit in the functional way. There are a hundred ways to write the loop slightly wrong, or with intended behavior slightly different from the regular. The functional variant: not so much. A small variation from the standard loop in functional style is visible. Very visible. Unintended ones simple won't happen, and the intended ones are an eyesore. An eyesore impossible to miss reading, whereas a subtle variation of the imperative loop is exactly that, subtle. Easy to miss. Readability advantage functional.

In my book, keeping simple things simple and the not simple things not simple beats simplicity everywhere. This is actually what I consider a big drawback of functional style: often the not-simple parts are way too condensed, almost indistinguishable from trivialities. But in the loop scenario it's often the reverse.

My happy place, when writing, would be an environment that has enough AST-level understanding to transform between both styles.

(other advantages of functional style: skills and habits transfer to both async and parallel, the imperative loop: not so much)

eitland · a year ago

This is easy to read, but in reality I have found things to typically be a bit less straightforward.

Three things typically happens:

1. people who like these chains really like them. And I've seen multiple "one liner expressions" that was composed of several statements anded or ored together needing one or two line breaks.

2. when it breaks (and it does in the real world), debugging is a mess. At least last time I had to deal with it was no good way to put breakpoints in there. Maybe things have changed recently, but typically one had to rewrite it to classic programming and then debug it.

3. It trips up a lot of otherwise good programmers who hasn't seen it before.

agent327 · a year ago

More readable? How about this:

SELECT DISTINCT authors FROM books WHERE page_count > 1000;

namaria · a year ago

Yeah but how do you get that data written to a structure database so you get to do that?

Archelaos · a year ago

I think, an explicit type would make it even easier to grok:

  ISet<Author> authorsOfLongBooks =
    books
    .filter(book => book.pageCount > 1000)
    .map(book => book.author)
    .distinct()
    .toHashset()

Or whatever the equivalent for ISet<Author> is in the respective language. Or IReadonlySet<Author> if the set should be immutable.

williamcotton · a year ago

And in a real functional language like F#...

  let authorsOfLongBooks = 
    books
    |> Seq.filter (fun book -> book.pageCount > 1000)
    |> Seq.map (fun book -> book.author)
    |> Seq.distinct

...you can set breakpoints anywhere in the pipeline!

hinkley · a year ago

I wouldn’t call 3 long. Which means you’ve picked the softball counterexample. If you were trying to play devil’s advocate, chose a longer legitimate one and show how a loop or other construct would make it better.

Three dots is just a random Tuesday.

wegfawefgawefg · a year ago

in the most popular languages this way of programming is hurt by the syntax.

this could be something more like

distinct

  filter books .pageCount >1000

  .author

i think fp looks pretty terrible in js, rust, python, etc

porridgeraisin · a year ago

When the filter/mapper becomes slightly more involved as it basically always is in real life code, the regular imperative approach is much nicer.

Spivak · a year ago

    authors_of_long_books = set()

    for book in books:
        if len(book.pages) > 1000:
            authors_of_long_books.add(book.author)

    return authors_of_long_books

You are told explicitly at the beginning what the type of the result will be, you see that it's a single pass over books and that we're matching based on page count. There are no intermediate results to think about and no function call overhead.

When you read it out loud it's also it's natural, clear, and in the right order— "for each book if the book has more than 1000 pages add it to the set."

tremon · a year ago

also it's natural, clear, and in the right order

That isn't natural to anyone who is not intimately familiar with procedural programming. The language-natural phrasing would be "which of these books have more than thousand pages? Can you give me their authors?" -- which maps much closer to the parent's linq query than to your code.

syklemil · a year ago

fwiw, once Python's introduced there's the third option on the table, comprehensions, which will also be suggested by linters to avoid lambdas:

    authors_of_long_books: set[Author] = {book.author for book in books if book.page_count > 1000}

These are somewhat contentious as they can get overly complex, but for this case it should be small & clear enough for any Python programmer.

mrkeen · a year ago

Awesome, giving it a quick scan,

  authors_of_long_books = set()

Now I know that authors_of_long_books is the empty set. Do I need to bother reading the rest?

tvier · a year ago

Much of both sides of this argument are opinion, but wrt this comment.

> ... no function call overhead.

This code has more function calls. O(n) vs 3 for the original

feoren · a year ago

> You are told explicitly at the beginning what the type of the result will be

I would argue that's a downside: you have to pick the appropriate data structure beforehand here, whereas .distinct() picks the data structure for you. If, in the future, someone comes up with a better way of producing a distinct set of things, the functional code gets that for free, but this code is locked into a particular way of doing things. Also, .distinct() tells you explicitly what you want, whereas the intention of set() is not as immediately obvious.

> There are no intermediate results to think about

I could argue that there aren't really intermediate results in my example either, depending on how you think about it. Are there intermediate results in the SQL query "SELECT DISTINCT Author FROM Books WHERE Books.PageCount > 1000"? Because that's very similar to how I mentally model the functional chain.

There are also intermediate results, or at least intermediate state, in your code: at any point in the loop, your set is in an intermediate state. It's not a big deal there either though: I'd argue you don't really think about that state either.

> and no function call overhead

That's entirely a language-specific thing, and volatile: new versions of a language may change how any of this stuff is implemented under the hood. It could be that "for ... in" happens to be a relatively expensive construct in some languages. You're probably right that the imperative code is slightly faster in most languages today, and if it has been shown via performance analysis that this particular code is a bottleneck, it makes sense to sacrifice readability in favor of performance. But it is a sacrifice in readability, and the current debate is over which is more readable in the first place.

> a single pass over books

Another detail that may or may not be true, and probably doesn't matter. The overhead of different forms of loops is just not what's determining the performance of almost any modern application. Also, my example could be a single pass if those methods were implemented in a lazy, "query builder" form instead of an immediately-evaluated form.

In fact, whether this query should be immediately evaluated is not necessarily this function's decision. It's nice to be able to write code that doesn't care about that. My example works the same for a wide variety of things that "books" could be, and the strategy to get the answer can be different depending on what it is. It's possible the result of this code is exactly the SQL I mentioned earlier, rather than an in-memory set. There are lots of benefits to saying what you want, instead of specifying exactly how you want it.

yongjik · a year ago

Maybe it's because I'm not familiar with such style, but I don't like how the code hides operational details. That is, if `books` contains one billion books, and the final result should contain about a hundred authors, how much extra memory does this use for intermediate results?

mrkeen · a year ago

The best way to kick the tyres on this kind of question is to plug in something literally infinite. That way if you arrive at an answer you're probably doing something right with regard to space and time usage.

For example, use all the prime numbers as an expression in your chain.

    import Data.Function
    import Data.Numbers.Primes

    main = do

        let result :: [Int] = primes
                            & filter (startingWithDigit '5')
                            & asPairs
                            & map pairSum
                            & drop 100000
                            & take 10

        print result

    asPairs xs = zip xs (tail xs)
    pairSum (a, b) = a + b
    startingWithDigit d x = d == head (show x)

> [100960734,100960764,100960792,100960800,100960812]

> 3 MiB total memory in use (0 MB lost due to fragmentation)

yxhuvud · a year ago

This is a valid concern I also reacted a little bit on. One thing to note though is that it is often possible to tell such chains to be lazy and only collect the end result at the end without ever generating any intermediary arrays.

Which require the author to actually have an idea how big the numbers are, but that is very often the case regardless of how you write your code.

patrick451 · a year ago

Sometimes I write code like this. Then I delete it and replace it with a for loop, because a loop is just easier to understand.

This functional style is what I call write only code, because the only person who can understand it is the one who wrote it. Pandas loves this kind of method chaining, and it's one of the chief reasons pandas code is hard to read.

throwA29B · a year ago

Chaining calls is an anti-pattern. Not only this is needless duplication of ye olde imperative statements sequence it also makes debugging, modifying ("oh I need to call some function in the middle of the chain, ugh"), and understanding harder for superficial benefit of it looking "cool".

It actively hurts maintainability, please stop using it.

There is a (large, I believe) aspect of good code that is fundamentally qualitative & almost literary. This annoys a lot of computer programmers (and academics) who are inclined to the mathematical mindset and want quantitative answers instead.

I love dostoyevsky and wodehouse, both wrote very well, but also very differently. While I don't think coding is quite that open a playing field, I have worked on good code bases that feel very different qualitatively. It often takes me a while to "get" the style of a code base, just as a new author make take a while for me to get.

louthy · a year ago

I 100% agree with this. One of the best compliments I ever got (regarding programming) was from one of my principal engineers who said something along the lines of "your code reads like a story". He meant he could open a code file I had written, read from top to bottom and follow the 'narrative' in an easy way, because of how I'd ordered functions, but also how I created declarative implementations that would 'talk' to the reader.

I follow the pure functional programming paradigm which I think lends itself to this more narrative style. The functions are self contained in that their dependencies/inputs are the arguments provided or other pure functions, and the outputs are entirely in the return type.

This makes it incredibly easy to walk a reader through the complexity step-by-step (whereas other paradigms might have other complexities, like hidden state, for example). So, ironically, the most mathematically precise programming paradigm is also the best for the more narrative style (IMHO of course!)

DidYaWipe · a year ago

I had a similar experience. I was a lead on a project where the client sent a functional expert to literally (at times) watch over my shoulder as I worked. He got very frustrated after a few weeks, seeing little in the way of code being laid down. He even complained to the project manager. That's because this was a complicated manufacturing system, and I was absorbing the necessary rules for it and designing it... a process that involved mostly sitting and thinking.

When I decided on the final design and basically barfed all the code out in a matter of days, I walked this guy (a non-programmer) through the code. He then wrote my manager a letter declaring it to be the "most beautiful code he had ever seen." I still have the Post-It she left in my cube telling me that.

I have little tolerance for untidy code, and also overly-clever syntax that wastes the reader's time trying to unravel it.

And now we have languages building more inconsistent and obscure syntax in as special-case options, wasting more time. Specifically I'm thinking about Swift; where, if the last parameter in a function call is a closure, it's a "trailing" closure and you can just ignore the function signature and plop the whole closure right there AFTER the closing parenthesis. Why?https://www.hackingwithswift.com/sixty/6/5/trailing-closure-...

This is just one example, and yeah... you can get used to it. But in this example, the language has undermined PARENTHESES, a notation that is almost universally understood to enclose things. When something that basic is out the window, you're dealing with language designers who lack an appreciation for human communication.

qwertygnu · a year ago

> The functions are self contained in that their dependencies/inputs are the arguments provided or other pure functions and the outputs are entirely in the return type.

Is this just a fancy way of saying static functions?

hinkley · a year ago

There’s a difference between simplifying a concept and stating it plainly.

I use this analogy a lot. Code can be like a novel, a short story, or a poem. A short story has to get to the point pretty quickly. A poem has to be even more so, but it relies either on shared context or extensive unpacking to be understood. It’s beautiful but not functional.

And there are a bunch of us short story writers who just want to get to the fucking point with a little bit of artistic flair, surrounded by a bunch of loud novel and mystery writers arguing with the loudest poets over which is right when they are both wrong. And then there’s that asshole over there writing haikus all the fucking time and expecting the rest of us to be impressed. The poets are rightfully intimidated but nobody else wants to deal with his bullshit.

zwnow · a year ago

I consider code bad if it takes more then 5 seconds to read and understand the high level goal of a function.

Doesn't matter how it looks. If its not possible to understand what a function accomplishes within a reasonable amount of time (without requiring hours upon hours of development experience), it's simply bad.

jacobr1 · a year ago

There is a call-stack depth problem here that is specific to codebases though. For one familiar with the the conventions, key data abstractions (not just data model but convention of how models are structured and relate) and key code abstractions, a well formed function is easy to understand. But someone relatively new to the codebase will need to take a bunch of time switching between levels to know what can be assumed about the state or control flow of the system in the context of when that function/subroutine is running. Better codebases avoid side-effects, but even with good separation there, non-trivial changes require strong reasoning about where to make changes in the system to avoid introducing side-effects and not just passing extra state or around all over the place.

So, I'd take "good architecture" with ok and above readability, over excellent readability but "poor architecture" any day. Where architecture in this context means the broader code structure of the whole project.

fasbiner · a year ago

Sounds like you are content to limit yourself to problems that do not contain more irreducible complexity or require more developer context than what fits within five seconds of comprehension.

That's a good rule for straightforward CRUD apps and single-purpose backend systems, but as a universal declaration, "it is simply bad" is an ex cathedra metaphysical claim from someone who has mistaken their home village for the entirety of the universe.

ajross · a year ago

> I consider code bad if it takes more then 5 seconds to read and understand the high level goal of a function.

That's something that's possible only for fairly trivial logic, though. Real code needs to be built on an internal "language" reflecting its invariants and data model and that's not something you can see with a microscope.

IMHO obsessive attention to microscope qualities (endless style nitpicking in code review, demands to run clang-format or whatever the tool du jour is on all submissions, style guides that limit function length, etc...) hurts and doesn't help. Good code, as the grandparent points out, is a heuristic quality and not one well-defined by rules like yours.

hinkley · a year ago

Code that looks like it has a bug in it but doesn’t will draw the eye over, and over, and over again when fishing for how regressions or bugs got into the code. This is the real cost of code smells. At some point it’s cheaper for me to clean up your mess than to keep walking past it every day. But I’m going to hate you a little bit every time I do.

callc · a year ago

Generally agree.

Consider reading kernel or driver code. These areas have a huge amount of prerequisite knowledge that - I argue - makes it OK to violate the “understand at a glance” rule of thumb.

andybp85 · a year ago

for the codebase i work on, i made a rule that "functions do what the name says and nothing else". this way if the function does too much, hopefully you feel dumb typing it and realize you should break it up.

sunrunner · a year ago

Does this apply to all domains and all 'kinds' of code?

I feel like there's a fundamental difference in the information density between code that, for example, defines some kind of data structure (introducing a new 'shape' of data into an application) versus code that implements a known algorithm that might appear short in line length but carries a lot of information and therefore complexity.

jayd16 · a year ago

So it should take 5 minutes whether it's your language or choice or the assembly it compiles to? Or does it matter how it looks in _that_ case?

bad_haircut72 · a year ago

Do you think you would understand every function in the doom codebase in under 5 seconds? Is this bad proframming then?

intrasight · a year ago

> Doesn't matter how it looks.

That's the mindset that the author is trying to counter.

jimmaswell · a year ago

This is what xmldoc/jsdoc/etc are for. If it's not 100% obvious from the name, put a summary of the function's assumptions, side effects, output, and possibly an example in the comment-doc. If you do this right, the next programmer will never have to read your source at all (or even navigate to your file! They'll hover over a method call or find it in the dot-autocomplete and see a little tootip with this documentation in it, and know all they need to know). It's an incredible thing when it works. It's a little bit more effort but I don't accept the FUD around "comments become out of date immediately because the code will change" etc. - that should be part of code review.

https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...

freetonik · a year ago

>This annoys a lot of computer programmers (and academics) who are inclined to the mathematical mindset and want quantitative answers instead.

I find many syntactical patterns that are considered elegant to be the opposite, and not as clear as mathematics, actually. For example, the the ternary operator mentioned in the article `return n % 2 === 0 ?'Even' : 'Odd;` feels very backwards to my human brain. It's better suited for the compiler to process the syntax tree rather than a human. A human mathematician would do something like this:

         ⎧  'Even' n mod 2 = 0
  f(n) = ⎨
         ⎩  'Odd'  n mod 2 ≠ 0

Which is super clear.

pc86 · a year ago

Well of course if you have the freedom to write a mathematical expression you're going to be able to present it in a way that is clearer than if you have to type monospace characters into a text editor.

I'm not sure it's realistic to expect to be able to type a mathematical expression using ascii more clearly than you can write it by hand (or implement using special unicode characters).

Timwi · a year ago

I understand that you might find the mathematical notation clearer but I think it's presumptuous of you to speak on behalf of all humans, or even all human mathematicians. I'm a mathematics graduate and I find the conditional operator more readable in a program because it corresponds to what the program actually does (it checks the condition first); but I also recognize that the two notations have exactly the same information content and only differ superficially in syntax, making it entirely a matter of familiarity.

gwbas1c · a year ago

This is why code reviews are so critical: They help keep a consistent style while onboarding new team members, and they help a team keep its style (reasonably) consistent.

(Also, see my comment about .editorconfig: https://news.ycombinator.com/item?id=43333011. It helps reduce discussions about style minutia in pull requests.)

WillAdams · a year ago

Arguably, this is why Literate Programming (see my comment elsethread) didn't take off.

User23 · a year ago

Mathematicians have recognized the importance of elegance for millennia.

type Dog = { breed: string size: "lg" | "md" | "sm" // ... } type DogBreedAndSize = Pick<Dog, "breed" | "size"> function checkDogs(dogs: Dog[]) : DogBreedAndSize[] { return dogs.map(d => /* ... */) } const checkedDoggos = checkDogs([])

//In vscode settings.json: "highlight.regexes": { "((?:this\\.)?(?:_)?logger(?:\\?)?.(debug|error|info|warn)[^\\)]*\\)\\;)": { "regexFlags": "gmi", "decorations": [{ "opacity": "0.4" }] } },