This makes me realize: the one example I can think of where software has been "Lego-like" is Unix piping. I'm not a Unix purist or anything, but they really hit on something special when it came to "code talking to other code".
Speculating about what made it stand apart: it seems like the (enforced) simplicity of the interfaces between pieces of code. Just text streams; one in, one out, nothing more nothing less. No coming up with a host of parallel, named streams that all have their own behaviors that need to be documented. And good luck coming up with a complicated data protocol built atop your text stream; it won't work with anything else and so nobody will use your program.
Interface complexity was harshly discouraged just by the facts-on-the-ground of the ecosystem.
Compare that with the average library interface, or framework, or domain-specific language, or REST API, etc. etc. and it becomes obvious why integrating any of those things is more like performing surgery.
I think the best way to describe pipes to the uninitiated is in terms of copy & paste.
Copy & paste is the basic, ubiquitous, doesn't-try-to-do-too-much IPC mechanism that allows normal users to shovel data from one program into another. As simple as it is, it's indispensable, and it's difficult to imagine trying to use a computer without this feature.
The same applies to pipes, even though they work a little bit differently and are useful in slightly different situations. They're the "I just need to do this one thing" IPC mechanism for slightly more technical users.
> difficult to imagine trying to use a computer without [copy & paste]
Remember the first iPhone? I wasn't into it, but a bunch of my (senior developer) colleagues were. I asked them how they lived without copy & paste and they all told me it was just no big deal.
Rich Hickey addresses this in his famous 2011 talk, titled Simple Made Easy.
> Are we all not glad we don’t use the Unix method of communicating on the web? Right? Any arbitrary command string can be the argument list for your program, and any arbitrary set of characters can come out the other end. Let’s all write parsers.
The way that I think about this is that the Unix philosophy, which this behavior is undoubtedly representative of, is at one end of a spectrum, with something like strict typing at the other end. Rich, being a big proponent of what the article describes as "Lego-like" development, clearly prefers neither end of the spectrum but something in between. In my opinion as well, the future of software development is somewhere in the middle of this spectrum, although exactly where the line should be drawn is a matter of trade-offs, not absolute best and worst. My estimation is that seasoned developers who have worked in many languages and in a variety of circumstances have all internalized this.
And yet, at least for the Unix tools I've used, nobody did write an elaborate parser. Instead, they all ended up using newlines to represent a sequence. There were never really nested structures at all. That least-common-denominator format they were forced into ended up making input and output really easy to quickly understand.
Maybe the problem with REST APIs is that JSON does too good a job of making it easy to represent complex structures? Maybe we'd be better off using CSV as our data format for everything.
> Are we all not glad we don’t use the Unix method of communicating on the web? Right? Any arbitrary command string can be the argument list for your program, and any arbitrary set of characters can come out the other end. Let’s all write parsers.
Um, but that's exactly what we do use on the web. Oh, sure, there are some really popular formats for input and output with widely available parsers (HTML, XML, JSON), and HTTP itself (as well as other protocols) specifies headers and other formats that need to be parsed before you get to those body formats, including headers that tell you about the other formats so you know which parsers you need to write, or use if you can find one already written for you.
There is also an audience mismatch. Most of the criticism of pipelines is from professional programmers.
A lot of the value of scripting, pipes and associated concepts is low barrier to entry.
I agree with the value of strong typing. But I also remember how infuriating it was to learn to work with a type system when I was learning to write code.
When I need a quick answer of some sort, iteratively piping crap at a series of "| grep | awk" is exactly what the doctor ordered. Sure, I could bullet-proof it nicely and make it reusable by investing the time to write it in something saner, but there's zero reason to - I'm not likely to ever want to perform the same action again.
> In my opinion as well, the future of software development is somewhere in the middle of this spectrum
Unfortunately this is complexity in and of itself. I don't disagree that different cases require different tools, but the split should be strongly weighted in one direction. Mostly IPC over pipes with a few exceptions, mostly REST and JSON with a few exceptions, mostly language X with a few exceptions. Everyone will have their own preferences, but I think it's important to pick a side (or at least mostly pick a side), or else you accept chaos.
> Are we all not glad we don’t use the Unix method of communicating on the web? Right? Any arbitrary command string can be the argument list for your program, and any arbitrary set of characters can come out the other end.
He clearly uses a different web to the one I use. >.>
In practice it’s totally inscrutable. I never remember or even feel comfortable guessing at anything more than the most basic. Meanwhile, any typed library in language X usually works immediately with no docs given a decent IDE.
I would argue that it thrives in those most basic cases, and isn't really suited to building truly complex systems. But I also don't think there's anything wrong with that. There's a use-case for simple pieces that are easy to snap together, and I think that use-case has been greatly under-served because lots of things that aim for it end up as complicated, multi-faceted APIs.
You could almost say that micro-services are trying to follow in the Unix tradition. But the problem is that a) they don't really get used in that ideal, small-scale use-case because they're almost always written and consumed internally, not exposed to the public, and b) they do get used in those huge, complex cases where their lego-ness stops being a virtue and starts being a liability.
In your practice you have favored an alternative. (not "in practice", implying absolute)
In my practice I have used both with great success. For logging, parsing, and displaying playback data from field systems, UNIX and UNIX-like tools have been incredible. VNLog in particular is a wonderful way to interact with data if you need just a bit of structure on top of unix outputs.
And anyway, getting from not-so-great data to something that a typed library in language X can parse is a great job for plain old Unix tools.
I think piping works well because it's an opinionated framework for IPC. The strong opinions it holds are:
1) data is a stream of bytes
2) data is only a stream of bytes
That's it. And it turns out that's a pretty powerful abstraction... Except it requires the developer to write a lot of code to massage the data entering and/or leaving the pipe if either end of it thinks "stream of bytes" means something different. In the broad-and-flat space where it's most useful---text manipulation---it works great because every tool agrees what text is (kind of... Pipe some Unicode into something that only understands ASCII and you're going to have a lousy day). When we get outside that space?
So while, on the one hand, it allows a series of processes to go from a text file to your audio hardware (neat!), on the other hand, it allows you to accidentally pipe /dev/random directly into your audio hardware, which, here's hoping you don't have your headphones turned all the way up.
This example also kind of handwaves something, in that you touched on it directly but called it a feature, not a bug: pipes are almost always the wrong tool if you do want structure. They're too flexible. They're very much the wrong API for anything where you cannot afford any mistakes, because unless you include something in the pipe chain to sanity-check your data, who knows what comes out the other end?
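To make the "every tool agrees what text is (kind of)" caveat concrete, here is a minimal Haskell sketch (assuming the bytestring and text packages, with made-up data) of the same bytes being perfectly fine as a byte stream but quietly mangled the moment one side assumes UTF-8:

    import qualified Data.ByteString as BS
    import qualified Data.Text.Encoding as TE
    import qualified Data.Text.Encoding.Error as TEE

    main :: IO ()
    main = do
      -- "hi " followed by a truncated multi-byte UTF-8 sequence (the first two bytes of the euro sign)
      let bytes = BS.pack [0x68, 0x69, 0x20, 0xE2, 0x82]
      print (BS.length bytes)                            -- as a stream of bytes: no problem, 5 bytes
      print (TE.decodeUtf8With TEE.lenientDecode bytes)  -- as text: a replacement character, data quietly lost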
A bit of a tangent, but... audio hardware can't really be treated as a stream of bytes either.
You used to be able to cat things to /dev/dsp, but that used something like 8 kHz, 8-bit, mono audio. That's horrendous. Because, with just a stream of bytes, you have to settle for the least common denominator - /dev/dsp had IOCTLs to set sample rate, number of channels, and bit depth, but... with just a stream of bytes, you can't do that.
Similarly, video data via /dev/fb0 - AFAIK you don't even have defaults there to rely on, to display anything useful you need to do IOCTLs to find out about its format.
When do you not want structure? Seriously-
Plain, human-readable ASCII text is maybe a candidate - but even then there's implicit structure (things like handling CR/LF, tabs...)
Unicode text? You know that's structure. (Ever had a read() call return in the middle of a UTF-8 multi-byte sequence?)
CSV? That's structure.
Tab-separated columns? That's structure.
Fixed-width columns? Also structure.
You don't get to not have structure. Structure is always there. The question is whether you get to have a spec for your structure, or whether it's just "well, from the output it looks like column 3 here is always the hostname of the server I want, so I'll use 'cut' or 'awk' to extract it". That approach can work in practice, but...
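As a concrete rendering of that last point, here is a small Haskell sketch (hypothetical field layout, standing in for the usual `cut`/`awk` one-liner) where the only "spec" for the structure is the assumption baked into one pattern match:

    import Data.Maybe (mapMaybe)

    -- Assumes whitespace-separated fields with the hostname in column 3;
    -- that assumption lives nowhere but here.
    thirdColumn :: String -> Maybe String
    thirdColumn line = case words line of
      (_ : _ : host : _) -> Just host
      _                  -> Nothing

    main :: IO ()
    main = interact (unlines . mapMaybe thirdColumn . lines)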
> That's it. And it turns out that's a pretty powerful abstraction
But that's not what an abstraction is! UNIX pipes are maximally low-level, as close to un-abstract as it is possible to get, except perhaps if it used bit-streams instead of byte-streams. It literally cannot get less abstract than that.
UNIX pipes are completely untyped, like a C codebase that uses only void* to pass all data. (Okay, some way of indicating end-of-stream is also needed, but it's still a good analogy.)
> pipes are almost always the wrong tool if you do want structure
Not almost the wrong tool, entirely the wrong tool.
You can't magically shoehorn types back into an untyped system when all the existing components assume untyped streams.
> I'm not a Unix purist or anything, but they really hit on something special when it came to "code talking to other code".
I agree, and it just hit me while reading your comment that the special thing is not just that you can plug any program into any other program. It's that if one program doesn't work cleanly with another, this enforced simplicity means that you can easily modify the output of one program to work with another program. Unix command-line programs aren't always directly composable but they're adaptable in a way that other interfaces aren't.
It's not great for infrastructure, don't get me wrong. This isn't nuts and bolts. It's putty. But often putty is all you need to funnel the flow of information this one time.
This is why functional programming and Lisps are such fantastic development environments: you can use components (functions) that are not very opinionated about what they are acting on.
Another fascinating thing is how we as a community reacted to this simplicity. One thing in particular that I find interesting is the conventions that have been built up. Many tools don't just accept text streams but react the same way to a common set of options and assume line-separation, among other things. None of these are defined in the interface, but they were good ideas that were adopted between projects.
Is this not the same description as a REST API? You pass in a text body and get back a text body? Everyone uses JSON instead of a complicated data protocol?
A UNIX pipeline is more like a set of operations on the same object passing through the pipeline, and typically that object is a text file. REST APIs represent an ability to interact with a repository of information with a really specific protocol, yet the response is not in the same format as the input. You couldn't pass the response of a REST API into another REST API without managing it externally.
Consider
cat myData | toolStripsWhitespace > myData.tmp && mv myData.tmp myData
Versus
var myData = RestApi->read(myDataRecordID)
var mutatedData = someMutation(myData)
RestApi->update(myDataRecordID, mutatedData)
You could of course write your ORM in a pipeline-like manner, many do. But that's got nothing to do with REST itself.
var myData = new SomeObject(RestApi->read(myDataRecordID))
->someMutation('params')
->someOtherMutation('params')
RestApi->update(myDataRecordID, myData->toJSON())
If you ever wondered why some people are obsessed with functional programming this is the reason why:
Functional programming forces every primitive in your program to be a Lego Block.
A lot of functional programmers don't see the big picture. They see a sort of elegance with the functional style, they like the immutability but they can't explain the practical significance to the uninitiated.
Functional Programming is the answer to the question that has plagued me as a programmer for years. How do I organize my program in such a way that it becomes endlessly re-useable from a practical standpoint?
Functional programming transforms organ transplantation into lego building blocks.
The "lego" is the "function" and "connecting two lego blocks" is "function composition".
In short, another name for "Point free style" programming is "Lego building block style" programming.
You can't just compose two functions because they're both written in a functional programming language. The programmer has to have the foresight to make their types compatible. I think the novelty of the Unix pipe for interoperability is that they (typically) work on one agreed-upon kind of data: human readable, white-space separated. So a lot of tools "just work" with each other.
There's no reason you can't do this with functional programming, but obviously you can do it with non-functional programming too, and you could certainly fail to do this with functional programming.
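A small Haskell sketch of both halves of this exchange (all names made up): pieces that share one agreed-upon "text stream" type snap together point-free in any order, while differently-typed functions only compose once someone writes the adapter:

    import Data.Char (toUpper)

    type Filter = String -> String   -- the shared, agreed-upon "shape"

    upcase, firstLine :: Filter
    upcase    = map toUpper
    firstLine = unlines . take 1 . lines

    -- Any two Filters compose like snapping bricks:
    shout :: Filter
    shout = upcase . firstLine

    -- These two do not compose until the programmer foresees an adapter:
    wordCount :: String -> Int
    wordCount = length . words

    banner :: Bool -> String
    banner long = if long then "wall of text" else "short note"

    report :: String -> String
    report = banner . (> 100) . wordCount   -- (> 100) is the hand-written glue

    main :: IO ()
    main = putStrLn (report "just a few words") >> putStr (shout "hello\nworld\n")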
I think this is a rather simplified and naïve analysis. Getting functional programs and functional APIs to compose well with each other is just as much a challenge as in other language paradigms. Just because the logic is organized as functions doesn't magically make things fit together. Your APIs need to speak in a consistent way as well, and the arrangement of your data needs to be the same or easily convertible between your "Lego pieces". Having spent many years of my career writing functional code all day, I can say it is just as easy to make a mess of things in functional programs as it is in object-oriented programs. I do not believe either is inherently better at creating the "Lego" style.
The goal of OOP is also to be modular and composable. I think the good design choice here is modularity and composability. It's what the industrial revolution / assembly line was based on. It's what vim keybindings are based on. Heck, it's what programming languages themselves are based on. Here are some simple tools that do easily understandable things together; now put something together with them. Lego bricks are fun and useful, and they're not the domain of any one area of CS.
You really just turned functional programming around for me. I learned CompSci object oriented (C++) but have always loved how easy data analysis was in unix output. Cheers for making me want to give it another look!
> Unix piping is basically functional programming.
Only in the same sense that all computing is Turing Machines or NAND gates.
This is a very common misunderstanding, but there is a reason that FP, which is transformational, had to adopt dataflow in order to sort-of handle reactive systems.
Functions run to completion and return their result. Filters tend to run concurrently and, importantly, do not return their results; they pass them on to the next filter in the pipeline.
It is possible to compose any two filters. It is not possible to compose any two functions, not even close, the default for functions is to not compose (arity, parameter types, return type,...).
Want to build web apps from reusable blocks? Reason at a higher level about chatrooms, roles and permissions, credits? That was the thinking behind our open source project: https://qbix.com/platform
Reusability on the web. Here is where we are going: https://qbix.com/QBUX/whitepaper.html#Distributed-Operating-...
One thing I appreciate about John D. Cook's blog is that he doesn't feel the need to pad out what he wants to say.
Here, he had a thought, and he expressed it in two paragraphs. I'm sure he could have riffed on the core idea for another 10 paragraphs, developed a few tangential lines of thought, inserted pull quotes -- in short, turned it into a full-blown essay.
Given that his blog serves, at least in part, as an advertisement for his services, he even has some incentive to demonstrate how comprehensive and "smart" he can be.
His unpadded style means I'm never afraid to check out a link to one of his posts on HN. Whereas I will often forego clicking on links to Medium, or the Atlantic, or wherever, until I have looked at a few comments to see whether it will be worth my time.
Honestly I was on the fence about clicking the link until I saw where it was from--his content is reliably interesting and straight to the point. If it was on Medium I wouldn't have even bothered and, like you, would have gone to the comments. The compression is lossy but it's a great filter for crap content.
There was a science teacher in my high school who had a similar rule. For any kind of lab report, instead of "you must write a report of at least 3 pages", it was "your report must not be more than 2 pages long."
Not only am I sure it made it easier for him to grade, but it really forced students to write concisely about their work.
For what it's worth, as an advertisement for his services, conciseness is better. It's easier to disagree with parts of a detailed opinion than with a vague general statement.
You can then project your own opinions into the general framework, and you find you fully agree :)
As a consultant, "I 100% agree with you, you understand me" is exactly the feeling you want.
He writes the long articles that show off his smarts in fairly specialized areas, where you need to be an expert to disagree.
It's really clever, and I'm curious if it's intentional on his part, or just his style.
This is the first time I've seen one his posts. It caught me really off guard but I completely agree with your sentiments. It is refreshing to see and I wish more blogs took this method to heart
Haskell is much closer to the lego blocks analogy than most languages I've tried due to the focus on composition and polymorphism.
The teetering edge that grinds some people's gears is monads, which don't compose generally but do compose in specific, concrete ways a la monad transformers. The next breakthrough here, I think, is going to be the work coming out of effect systems based on free(r) monads and delimited continuations. Once the dust settles here I think we'll have a good language for composing side effects as well.
In the current state of things I think heart-surgery is an apt metaphor. The lego brick analogy works for small, delimited domains with denotational semantics. "Workflow" languages and such.
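For readers who haven't seen the transformer version of "compose in specific, concrete ways": a tiny sketch using the transformers package, stacking a read-only environment on top of a mutable counter instead of trying to compose the two monads directly (names are illustrative):

    import Control.Monad.Trans.Class  (lift)
    import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
    import Control.Monad.Trans.State  (StateT, modify, execStateT)

    type App = ReaderT Int (StateT Int IO)

    step :: App ()
    step = do
      increment <- ask                -- effect 1: read-only configuration
      lift (modify (+ increment))     -- effect 2: mutable counter

    main :: IO ()
    main = do
      total <- execStateT (runReaderT (step >> step >> step) 5) 0
      print total                     -- 15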
I like Haskell, I write Haskell at my day job (and did so at my previous day job), and I help maintain some of the community build infrastructure so I’m familiar with a large-ish graph of the Haskell ecosystem and how things fit together.[0]
I don't really think Haskell is _meaningfully_ superior to other languages at the things that OP is talking about.
Refactoring Haskell _in the small_[1] is much nicer than in many other languages, I don't disagree on that point. Despite this, Haskell applications are _just as susceptible_ to the failures of software architecture that bind components of software together as other languages are.
In some cases I would even suggest that combining two Haskell applications can be _more_ fraught than in other languages, as the language community doesn’t have much in the way of agreed-upon design patterns that provide common idioms that can be used to enmesh them cleanly.
[0] I’m mostly belaboring these points to establish that I’m not talking out of my ass, and that I’ve at least got some practical experience to back up my points.
[1] This is to say, when one refactors individual functions or small collections of interlocking abstractions.
> Despite this, Haskell applications are _just as susceptible_ to the failures of software architecture that bind components of software together as other languages are.
I think it's more complicated than this. Yes, you can push poorly-architected Haskell to production & be in a rough spot. However, my experience says that even the gnarliest Haskell is easier to improve than any other language.
Because of the types, purity, etc, I find that it's much easier to zoom around a codebase without tracing every point in between. I can typically make one small change to "crack things open" [1], follow GHC's guidance, and then go from there. I've been able to take multiple large Haskell projects that other engineers deemed unfixable (to the point where there were talks of rewrites) & just fix them mechanically and have them live & improve continuously for years to come.
The big thing with Haskell IME is you don't really need to have design patterns that everyone follows. I don't freak out when I see multiple different idioms used in the same codebase because idgaf about folk programming aesthetic. If an idiom is used, I follow it. It's all mechanical. I barely use my brain when coding professionally in Haskell. I save it all for the higher-level work. Wish I could say that about professionally programming in other languages of equal experience :/
So while it's just as susceptible (because good vs bad software architecture is more a function of time & effort) it's also typically pretty braindead to fix.
[1] A favorite technique is to add a new case to a key datatype and have its body be Void. Then I just follow the pattern match errors & sprinkle in `absurd`. I now have a fork in the road that is knowably a no-op at runtime.
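For anyone who hasn't seen the trick in [1], a hedged sketch with invented names of what "a new case whose body is Void" looks like; GHC's incomplete-pattern warnings point at every site that now needs a decision, and `absurd` marks the ones that can't happen yet:

    {-# OPTIONS_GHC -Wincomplete-patterns #-}
    import Data.Void (Void, absurd)

    data PaymentMethod
      = Card String
      | Invoice Int
      | Wallet Void          -- the new fork in the road; no value of this shape can exist yet

    describe :: PaymentMethod -> String
    describe (Card digits) = "card ending " <> digits
    describe (Invoice n)   = "invoice #" <> show n
    describe (Wallet v)    = absurd v   -- compiler-guided stop: knowably a no-op at runtime

    main :: IO ()
    main = putStrLn (describe (Invoice 42))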
WAI is a great example of the sort of compat/interop interfaces that are more common and easier to roll out in Haskell than in non-(typed functional) languages: https://github.com/yesodweb/wai
You've got far more Haskell experience than I do, but I have done some pretty heavy refactoring on large java codebases. The process always seemed to be, tease out some interfaces and switch the implementation of those interfaces. Over and over and over. I could lean on javac and tests but some knots are hard to untangle and take a long time.
I believe you that in the large it's still hard. It seems so much more pleasant day to day untangling that big ball of string with Haskell rather than Java.
Functional programming does it better - but it still suffers from the issue that when developing a programming solution we developers need to account for all edge cases (if we're doing it right), which requires a lot of decisions to be made. The important decisions for the use of the module will be carefully made; the less important decisions will be arbitrarily made. Almost no use of the module will require every edge case to be decided in a specific direction, but when that module is reused the new consumer will probably have a slightly different requirement about which decisions go which ways - this, I think, is the central pain point of software reuse.
Completely agnostic problems do exist and modules to solve those can be very strong - but that is a small subset of all the problems we want modules for.
And yet back in 1993, Visual Basic programmers were able to reuse software by literally snapping together controls like Lego blocks. There was a rich ecosystem of third-party controls. Competing tools such as Delphi had similar features. Since then the industry has gone backwards, or at best sideways, in many areas.
I wanted to make a similar point with piping UNIX commands. I can think of two reasons why the degradation happened:
1. Expansion of the software universe. Back in VB6 times, there were fewer programmers but many languages. Reusing components made with different languages was a big deal (VB used COM/ActiveX machinery to make this possible), but today there are so many more developers, that each language/ecosystem is big enough to exist on an island and happily not interact with anything unless it's a grpc/rest endpoint.
2. Transition to SaaS. We no longer use the same platform for building and running. Your VB app used to run on more or less the same computer it was built on. SaaS applications run in weird, custom-made, hard to reproduce and fragile computers called "environments". They are composed from all kinds of complex bits and this makes SaaS applications less portable. Frankly, they feel more like "environment configuration" than "code" sometimes.
The SaaS expansion could make componentisation easier - it would just be microservices, but from different companies. Somehow it doesn't. Possibly because it's not in their interest to do so.
On the other hand .. look at how much software goes into e.g. car entertainment systems. How many systems have "curl" in, for example. Have a look at the vast list of BSD license acknowledgements in many systems.
Look at the npm ecosystem, where people complain that there's too much componentisation.
Imagine two large structures made of Legos, say a man-figure and a car-figure. How easy would it be to snap those together, to make a "man in a race car" figure? A moment's thought would show it would be quite hard. Neither is likely to have merely a flat surface.
So basically, increased complexity of components, any components, makes them harder to combine - except in very carefully controlled circumstances.
It's neither as bad as an organ transplant, nor as easy as LEGO.
It is also highly variable, dependent upon the SDKs and API choices.
I've written SDKs for decades. Some are super simple, where you just add the dylib or source file, and call a function, and other ones require establishing a context, instantiating one (or more) instances, and setting all kinds of properties.
I think it's amusing, and accurate, in many cases; but, like most things in life, it's not actually that simple.
xargs/etc are like arcane Legos, where there's more than one brick vendor in the mix, with different vendors' versions of the brick supporting different features, or the same feature but implemented in different ways. And even if you're only dealing with one brick vendor, there's still little uniformity in the interfaces of different bricks. GNU vs BSD xargs, -0 vs -print0, etc.
True Lego has the property that if two parts snap together, it's a valid configuration. That's definitely not the case for Unix utilities. You have to read a manual for each sort of brick.
I think the author is discussing large existing apps, for instance connecting something like an externally built authentication layer to an existing user management suite. It's not just plug and play but a series of careful surgical moves.
Obviously shuffling data around is a different (easier) beast, especially for one off tasks.
This is an astute metaphor. In my experience software reuse simplicity strongly depends on the following factors:
* interface surface area (i.e. how much of an interface is exposed)
* data types of data coming in and out (static or dynamic). Static languages have an advantage here as many integration constraints can be expressed with types.
* whether it is a very focused functionality (e.g. extracting EXIF from file) vs cross-cutting concerns (e.g. logging)
The more limited the surface area, the simpler the data types and invariants, and the more localized the functionality, the more it is like LEGO as opposed to an organ transplant.
For reusing software source I agree. The only current way around this is the Unix pipe system, where you reuse software _executables_ instead of software _source code_.
The reason it works is that Unix programs agree to a simple contract of reading from stdin and writing to stdout, which is very limiting in terms of concurrency but unlocks a huge world of compatibility.
I wonder if we will ever get software legos without the runtime bloat from forking.
ps: to anyone countering with examples of languages that are reusable through modules, that doesn't count because you are locked in to a given language.
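A tiny Haskell sketch of a program that honors exactly that contract, so it can drop into a pipeline next to grep and awk; the only "interface" is newline-separated text on stdin and stdout (the file name in the comment is just illustrative):

    -- Reads stdin, drops blank lines, writes stdout; e.g.  cat notes.txt | runghc Squeeze.hs | sort
    main :: IO ()
    main = interact (unlines . filter (not . null) . lines)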
> I wonder if we will ever get software legos without the runtime bloat from forking.
In a sense, shared object files / dynamically linked libraries meet this criteria -- they can be loaded into program memory and used by a single process.
There's also flowgraph-based signal processing systems, like gnuradio, which heavily use the concept of pipes (usually a stream of numbers or a stream of vectors of numbers) but, as I understand it, don't require OS forking. (Though they do implement their own schedulers for concurrency, and for gnuradio at least, blocks are typically shipped as source so I'm not sure whether that counts as reusing executables vs. reusing source code.)
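On the shared-object point, a hedged Haskell sketch (assuming the unix package and a glibc system where the math library is visible as libm.so.6) of reusing an already-compiled artifact in-process, with no fork involved:

    import Foreign.C.Types (CDouble)
    import Foreign.Ptr (FunPtr)
    import System.Posix.DynamicLinker (RTLDFlags (RTLD_NOW), dlopen, dlsym)

    -- Turn a raw function pointer from the library into a callable Haskell function.
    foreign import ccall "dynamic"
      mkUnary :: FunPtr (CDouble -> CDouble) -> (CDouble -> CDouble)

    main :: IO ()
    main = do
      libm <- dlopen "libm.so.6" [RTLD_NOW]   -- assumption: this SONAME exists on the host
      cosp <- dlsym libm "cos"
      print (mkUnary cosp 0.0)                -- 1.0, computed by code we never compiled or forked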
Uzbl is a collection of "web interface tools" that adhere to the Unix philosophy and come together to create a browser.
https://www.uzbl.org/
PowerShell's structured and typed object streams are more UNIX than UNIX: https://news.ycombinator.com/item?id=23423650
BTW, if I were making the UNIX command line today, it would use LinkedHashMaps for everything instead of text streams.
Except you literally use side effects to communicate. Not really FP, that part.
I haven't thought about this very much, and there is a lot I'm curious about that he hasn't elaborated on.
What are the signs of rejection? What's an example of failure, and are there examples of that wonderful modular behavior that he admires?
It's a nice way to introduce a thought or observation, but I want to know more about why he thinks that, not just what he thinks.
Glomming together functions that operate on very abstract data structures feels a lot more like Legos than wiring traditional imperative/OO code.
See JVM (garbage collection), React, Datomic
Functions and scalar values are probably enough.
We don't fully get back to component reuse, but it makes the sharing of services much more feasible, and more portable as well.
So does HTML/CSS/JS, and they'll also often be adapted to the various popular front end frameworks of the day - Angular, React, Vue currently.
I also find it easier to customize and compose 3rd party UI components on the web than I did back in the 90's with VB, Delphi, MFC, COM, etc.
I feel like this is trying to argue for more “consulting surgeons” when we need more “tooling machinists” who know how to make a good LEGO block.
It is rather unfortunate for us that everything that's come afterwards has been in some way even worse.