Good programmers worry about data structures and their relationships

Data structures are not the same thing as types. Data structures are bit patterns and references to other bit patterns (pointers or relationships). Types (as they are used in programming languages) place some constraints on those bit patterns, but can also encode many other language features.

Creating an elaborate type hierarchy with unnecessary abstractions is not what is meant by "worrying about data structures", and that tendency is one of the most common failure modes for otherwise smart engineers.

Arelius · a year ago

I think this is a subtle and important point. Types are a potential useful tool to restrain and and specify the schema of data structures. But concern for types is very different then concern for the data structures.

rbanffy · a year ago

In OOP types and data structures have a much less visible border between them.

phkahler · a year ago

Types are data structures that the language is aware of. This allows the tooling to do checks that they can't do on plain old data structures.

jvanderbot · a year ago

Data structures are algorithms at rest. They shuffle and move things each operation, but mostly sit still, like a turing machine that people only crank once in a while.

Types are the bits on disc.

atomicnature · a year ago

Good catch.

Equating data structures to types is an over simplification that misses the core point.

I think the original call here is to simply think harder about the problem and avoid picking structures that'll burn you later.

For example, take Unix pipes, see how far they've traveled, how many domains, how many use cases. It's a brilliant way to visualize system building while respecting the constraints of minds and machines.

And it took Ken and others quite a while to realize something like pipes could make sense in Unix. It was not an insight easily obtained but required a bit of hustle and followup and obsession with finding the right building blocks for a system.

EVa5I7bHFq9mnYK · a year ago

The same data structure can be assigned different types, that's what typedef operator in Pascal does.

Linus always has a great way of summarizing what others might be thinking (nebulously). What's being said in the article is really mirrored in the lost art of DDD, and when I say "lost" I mean that most developers I encounter these days are far more concerned with algorithms and shuttling JSON around than figuring out the domain they're working within and modelling entities and interactions. In modern, AWS-based, designs, this looks like a bunch of poorly reasoned GSIs in DDB, anemic objects, and script-like "service" layers that end up being hack upon hack. Maybe there was an implicit acknowledgement that the domain's context would be well defined enough within the boundaries of a service? A poor assumption, if you ask me.

I don't know where our industry lost design rigor, but it happened; was it in the schools, the interviewing pipeline, lowering of the bar, or all of the above?

mattgreenrocks · a year ago

I’d argue software design has never been taken seriously by industry. It’s always cast in negative terms, associated with individuals seen as politically wrong/irrelevant and brings out ton of commenters who can’t wait to tell us about this one time somebody did something wrong, therefore it’s all bad. Worse, design commits the cardinal sin of not being easily automated. Because of this, people cargo cult designs that tools impose on them, and chafe at the idea that they should think further on what they’re doing. People really want to outsource this thinking to The Experts.

It doesn’t help that isn’t really taught, but is something you self-teach over years, it is seen as less real than code (ergo, not as important). All of these beliefs are ultimately self-limiting and keep you at advanced beginner stage in terms of what you can build, however.

Basically, programmers collectively choose to keep the bar as low as possible and almost have a crab-like mentality on this subject.

wredue · a year ago

I can see a swing finally starting. It isn’t “huge” by any stretch, but at the same time

“deVElOpErS aRe MoRE EXpEnSivE tHaN HArDwaRE”

Commenters are no longer just given free internet points. This is encouraging as these people controlled the narrative around spending time on thinking things through and what types of technical debt you should accept for like 20 YEARS.

I think maybe people are finally sick of having 128 gigs of ram being used by a single 4kb text file.

Deleted Comment

cratermoon · a year ago

> most developers I encounter these days are far more concerned with algorithms and shuttling JSON around than figuring out the domain they're working within and modelling entities and interactions

The anemic domain model was identified as an anti-pattern quite a long time ago[1]. It usually shows up along with Primitive Obession[2] and result in a lot of code doing things to primitive types like strings and numbers, with all kinds of validation and checking code all over the place. It can also result in a lot of duplication of code that doesn't look obviously like duplication because it's not syntactically identical, yet it's functionally doing the same thing.

1 https://martinfowler.com/bliki/AnemicDomainModel.html

2 https://wiki.c2.com/?PrimitiveObsession

B-Con · a year ago

The industry predominately rewards writing code, not designing software.

I think the results of bad code aren't as obvious. A bad bridge falls down, bad code has to be... refactored/replaced with more code? It goes from one text file that execs don't understand to a different text file that execs doesn't understand.

And once something works, it becomes canon. Nothing is more permanent than a temporary hack that happens to work perfectly. But 1000 temporary hacks do not a well-engineered system make.

I believe that maturing in software development is focusing on data and relationships over writing code. It's important to be able to turn it into code, but you should turn those into code, not turn code that works into a data model.

krooj · a year ago

> The industry predominately rewards writing code, not designing software.

The sad part of this is that code is absolutely a side-effect of design and conception: without a reason and reasonable approach, code shouldn't exist. I really think that the relative austerity happening in industry right now will shine a light on poor design: if your solution to solving poorly understood spaces was to add yet another layer of indirection in the form of a new "microservice" as the problem space changed over time, it's probably more likely that there was an inherent poor underlying understanding of the domain and lack of planning extensibility in anticipation. Essentially, code (bodies) and compute aren't as "cheap" as they were when money was free, so front-loading intelligent design and actually thinking about your space and it's use-cases becomes more and more important.

rbanffy · a year ago

> The industry predominately rewards writing code, not designing software.

This also stems from most of the code being written at any given moment being to solve problems we already solved before and doing or supporting mundane tasks that are completely uninteresting from the software design point of view.

hirvi74 · a year ago

> anemic objects

I have yet to come across a compelling reason why this is such a taboo. Most DDD functions I have seen also are just verbose getters and setters. Just because a domain entity can contain all the logic doesn't mean it should. For example, if I need to verify if a username exists already, then how do I go about doing that within a domain entity that "cannot" depend on the data access layer? People commonly recommend things like "domain services," which I find antithetical to DDD because now business logic is being spread into multiple areas.

I quite enjoy DDD as a philosophy, but I have the utmost disdain for "Tactical DDD" patterns. I think too many people think Domain-Driven Design == Domain-Driven Implementation. I try to build rich domains where appropriate, which is not in all projects, but I try not to get mired up in the lingo. Is "Name" type a value object or an aggregate root? I couldn't care less. I am more concerned about the bounded contexts than anything else. I will also admit that DDD can sometimes increase the complexity of an application while providing little gains. I wouldn't ever dare say it's a silver-bullet.

I will continue to use DDD going forward, but I can't help but shake this feeling that DDD is just an attempt at conveying, "See? OOP isn't so bad after all, right?" Of which, I am not sure it accomplishes that goal.

arwhatever · a year ago

If you replace the Object-Oriented mechanism for encapsulation with some other mechanism for encapsulation then there's probably no reason for this taboo.

But in 99.999999% of real-world projects, anemic object-oriented code disregards encapsulation completely, and so business logic (the core reason why you're building the software in the first place) gets both duplicated and strewn randomly throughout the entire code project code.

Or in many cases, if the team disregards encapsulation at the type level then they're likely to also disregard encapsulation at the API/service/process level as well.

bell-cot · a year ago

With decades of exponential growth in CPU power, and memory size, and disk space, and network speed, and etc. - the penalties for shit design mostly went away, so you could usually get away with code monkeys writing crap as fast as they could bang on the keyboards.

et-al · a year ago

Looks like that substack just copied a bunch of quotes from this Stack Exchange post:

https://softwareengineering.stackexchange.com/questions/1631...

dang · a year ago

I've changed the URL from https://read.engineerscodex.com/p/good-programmers-worry-abo... now. Urgh spammers.

There was a wave of this substack spam a couple months ago and I suppose this is the same bad actor starting up again.

Sorry about that. Next time I’ll do more research.

kjs3 · a year ago

Didn't even bother to change the order of the quotes. That's bold.

mrkeen · a year ago

And that paragraph at the end came out of nowhere:

> It’s why one of the Senior Engineer (L5) requirements (at least for FAANG) generally involves writing higher-level design docs for more complex systems (which includes driving team planning and building good roadmaps for medium-to-large features).

dswilkerson · a year ago

"Show me your flowcharts [code], and conceal your tables [schema], and I shall continue to be mystified; show me your tables [schema] and I won't usually need your flowcharts [code]: they'll be obvious." -- Fred Brooks, "The Mythical Man Month", ch 9.

sjm · a year ago

I once wrote to John Carmack as a Quake-obsessed kid, asking for any advice he has for an aspiring programmer and if he had any favourite books. To my surprise he wrote back a really thoughtful response, including the following:

"Read The Mythical Man Month. I remember thinking that a book that old can't say anything relevant about software development today, but I was wrong."

datadrivenangel · a year ago

I came here to share this quote because it's so true.

Except when the effort to change the database schema becomes significantly greater than the effort to change the code, and then application developers start abusing the database because it's faster and they have things to do.

alphazard · a year ago

AndrewKemendo · a year ago

It’s so interesting because I started doing professional engineering AFTER doing day to day data and statistical analysis in statistical systems like matlab, R and early Python.

So my view of engineering has always been based on managing two things: functional state and data workflows

After doing software engineering professionally for a decade now I can tell you that:

1. Most “scientific” engineers back to Minsky, Shannon etc… describe the world of computing in terms of state management, data transformation and computing overhead management. All of the big figures and pioneers in software cared A LOT about data and state basically that’s all computing was at the beginning and was expected to be the pattern moving forward

2. There’s absolutely no consistency in what are the foundationally important assumptions in engineering system design that are always true such that everyone does them - and the ones that do are fads at best

3. Business timelines dictate engineering priorities and structures much more than robustness, antifragility, state management etc… in the vast majority of production software

4. Professional organizations like guilds, unions, etc… are almost universally rejected by software engineers. Nobody actually takes IEEE seriously because there’s no downside if you don’t. This ensures there’s no enforcement or self-regulation in engineering practices the same way there are in eg Civil and biomedical engineering. Even then those are barely utilized.

Overall the state of software development is totally divorced from its exceptionally high minded and philosophical roots, and is effectively led by corporations that are priorizing systems that make money for people with money.

So what is “good” has very little to do with what is incentivized

cpeterso · a year ago

“Show me your flowcharts [code] and conceal your tables [data structures], and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.”

-- Fred Brooks

polygotdomain · a year ago

I think this quote misses that there can (and arguably should) be differences between your persistence model and your actual data structures. I'd argue that keeping things 1:1 with your underlying tables is incredibly restrictive and leads to models that miss out on the expressiveness that's available in modern languages.

wtetzner · a year ago

I think the brackets were simply suggesting that flow charts are analogous to code and tables are analogous to data structures in that quote. Not that your tables and data structures in a concrete system will be the same.

ants_everywhere · a year ago

This is essentially the point of view of functional programming and category theory.

You have some data object whose structure provides constraints on how it can be transformed. And then the program logic is all about the structure-preserving transformations.

The transformations become simpler and easier to reason about, and you're basically left with a graph where the transformations are edges and the structures are nodes. And that's generally easier to reason about than an arbitrary imperative program.

constantcrying · a year ago

>This is essentially the point of view of functional programming and category theory.

No, it isn't. This is the point of every language philosophy, you will find OOP and procedural people arguing exactly this. Correctly defining your data types is important and applicable in every language and every paradigm.

The view of functional programming is that objects shouldn't be transformed and that mutation should be avoided. That is unrelated.

The point of category theory is that different patterns of relationships are common across mathematical fields. Which is totally unrelated and has nothing to do with anything discussed here. Maybe you meant type theory? But that also has no relation.

Nope I meant category theory

> Maybe you meant type theory? But that also has no relation.

Hmm? Category theory and type theory have a lot of close ties. For example see the Nlab page on their connections https://ncatlab.org/nlab/show/relation+between+type+theory+a...

swyx · a year ago

A conclusion I reached a while ago: all the work we do in code is far more likely to be shorter lived than a single good decision that we have in data.

https://www.swyx.io/data-outlasts-code-but

delichon · a year ago

The good decisions are invisible. The bad decisions just seem to live forever.