ristos · a year ago
Micro-libraries are really good actually: they're highly modular, self-contained code, which often makes it really easy to understand what's going on.

Another advantage is that because they're so minimal and self-contained, they're often "completed", because they achieved what they set out to do. So there's no need to continually patch it for security updates, or at least you need to do it less often, and it's less likely that you'll be dealing with breaking changes.

The UNIX philosophy is also built on the idea of small programs, just like micro-libraries: doing one thing and doing it well, and composing those things to make larger things.

I would argue the problem is how dependencies in general are added to projects, which the blog author pointed out with left-pad. Copy-paste works, but I would argue the best way is to fork the libraries and add submodules to your project. Then if you want to pull a new version of the library, you can update the fork and review the changes. It's an explicit approach to managing it that can prevent a lot of pitfalls like malicious actors, breaking changes leading to bugs, etc.
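Concretely, something like this (the repo URL, remote, and branch names are just placeholders):

    git submodule add https://github.com/you/forked-micro-lib.git vendor/micro-lib
    # later, to pull a new upstream version into your fork:
    cd vendor/micro-lib
    git fetch upstream
    git diff HEAD upstream/main   # review the changes before merging them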

foul · a year ago
Micro-libraries anywhere else are everything you said: building blocks that come after a little study of the language and its stdlib and will speed up development of non-trivial programs.

In JS and NPM they are a plague, because they promise to be a substitute for competence in basic programming theory, for competence in JS itself, for the gaps and bad APIs inside JS, and for de-facto standards in the programming community like the oldest functions in libc.

There are a lot of ways to pad a number in JS, and a decent dev would keep their own utility library, or hell, a single function to copy-paste for that. But no. npm users are taught to fire and forget, and to update everything, with no concept of vendoring (which would have made incidents like left-pad, faker, and colors less maddening; vendoring is even built into npm, and it's very good!). For years they have been copy-pasting in the wrong window, really: they should copy-paste blocks of code, not npm commands. And God help you if you type out your npm commands by hand, because bad actors have bought into the trend and made millions of libraries with a hundred different scams waiting for fat fingers.

Given that backend JS optimizes for reducing cost whatever the price, becoming Smalltalk for the browser and for PHP devs, you would expect some kind of standard to emerge for having a single way to do routine stuff. Instead in JS-world you get TypeScript, and in the future maybe WASM. JS is just doomed. Like, we are doomed if JS isn't, to be honest.

ivan_gammel · a year ago
The whole web stack must die and be replaced. JS, CSS, HTML, and HTTP are a huge cost center for the global economy.
orhmeh09 · a year ago
Could you link to somebody who is teaching npm users to "fire and forget?" Someone who is promising a substitute for competence in basic programming theory? Clearly you and I do not consume the same content.
porcoda · a year ago
The UNIX philosophy is being a bit abused for this argument. Most systems that fall under the UNIX category are more or less like a large batteries-included standard library: lots of little composable units that ship together. UNIX in practice is not about getting a bare system and randomly downloading things from a bunch of disjointed places like tee and cat and head and so on, and then gluing them together and perpetually having to keep them updated independently.
ristos · a year ago
They ship together because all of those small composable units, once developed by random people, were turned into a meta-package at some point. I agree with you that randomly downloading a bunch of disjointed things without auditing and forking them isn't good practice.

I'm also not arguing against a large popular project with a lot of contributors if it's made up of a lot of small, modular, self-contained code that's composed together and customizable. All the smaller tools will probably work seamlessly together. I think UNIX still operates under this sort of model (the BSDs).

There's a lot of code duplication and bad code out there, and way too much software that you can't really modify or customize for your use case, because that becomes an afterthought. Even if you did learn a larger codebase, if it's not made up of smaller modular parts, then whatever you modify has a significantly higher chance of breaking once the library gets updated: you changed internal code, and the library authors aren't going to worry about breaking changes for someone maintaining a fork that touches internals.

syncsynchalt · a year ago
> randomly downloading things from a bunch of disjointed places like tee and cat and head and so on, and then gluing them together and perpetually having to keep them updated independently.

I have distressing news about my experience using Linux in the '90s

wizzwizz4 · a year ago
We should totally have a system like that, though. It'd be such a great learning environment.
ivan_gammel · a year ago
> So there's no need to continually patch it for security updates, or at least you need to do it less often, and it's less likely that you'll be dealing with breaking changes.

Regardless of how supposedly good or small the library is, the frequency at which you need to check for updates is the same. It doesn't have anything to do with the perceived or original quality of the code. Every 3rd-party library has at least a dependency on the platform, and platforms are big: they have vulnerabilities and introduce breaking changes. Then there's the question of trust and the consistency of your delivery process. You won't adapt your routines to the specifics of every tiny piece of 3rd-party code, so you probably check for updates regularly and for everything at once. At that point their size is no longer an advantage.

> Copy-paste works, but I would argue the best way is to fork the libraries and add submodules to your project. Then if you want to pull a new version of the library, you can update the fork and review the changes.

This sounds “theoretical” and is not going to work at scale. You cannot seriously expect application-level developers to understand the low-level details of every dependency they want to use. For a meaningful code review of merges they must be domain experts; otherwise the effectiveness of such an approach will be very low, and they will inevitably have to trust the authors and just merge without going into details.

ristos · a year ago
They don't need to understand the low-level dependencies. People can create metapackages out of a bunch of self-contained libraries that have been audited and forked, and devs can pull in the metapackages. The advantage is the modularity, which makes the code easier to audit and more self-contained.

When's the last time ls, cat, date, tar, etc. needed to be updated on your Linux system? Probably almost never. And composing them together always works. This set of Linux tools (call it sbase, ubase, plan9 tools, etc.) is one version of a metapackage. How often does a very large package need to be updated for bug fixes, security patches, or new versions?

GuB-42 · a year ago
If these libraries are so small, self-contained and "completed", why not just copy-paste these functions?

Submodules can work too, but do you really need the extra lines in your build scripts, the extra files and directories, and the import lines, just for a five-line function? Copy-pasting is much simpler, with maybe a comment referring to the original source.

Note: there may be some legal reasons for keeping "micro-libraries" separate, or for not using them at all, though. But IANAL, as they say.

5Qn8mNbc2FNCiVV · a year ago
As soon as source code is in your repo, it's way more likely to get touched. I'd never open that box, because I don't want to waste my team's time touching, or reviewing, code that they shouldn't.

If you want the same functionality, build it according to the conventions in the codebase and strip out everything else that isn't required for the exact use case (since it's not a library anymore)

Barrin92 · a year ago
">The UNIX philosophy is also build on the idea of small programs, just like micro-libraries, of doing one thing and one thing well, and composing those things to make larger things."

The Unix philosophy is also built on willful neglect of systems thinking. The complexity of a system isn't in the complexity of its parts but in the complexity of the interactions between its parts.

Putting ten micro-libraries together, even if each is simple, doesn't mean you have a simple program; in fact it doesn't even mean you have a working program, because that depends entirely on how your libraries play together. When you implement the content of micro-libraries yourself, you have to be conscious not just of what your code does, but of how it works, and that's a good first defense against putting together parts that don't fit.

ristos · a year ago
It's not a willful neglect of systems thinking. Functional programmers have been able to build very large programs made primarily of pure functions composed together, and that makes them much easier to debug, because everything is self-contained and you can easily decompose parts of the program. The same goes for effectful code, leveraging things like algebraic effects.
alerighi · a year ago
> The UNIX philosophy is also built on the idea of small programs, just like micro-libraries: doing one thing and doing it well, and composing those things to make larger things.

They have small programs, but those don't come from different projects. For example, all the basic Linux utilities are developed and distributed as part of the GNU coreutils package.

It's the same as having a modular library with multiple functions in it that you can choose from. In fact the problem is that functions like isNumber shouldn't even be libraries; they should be in the language's standard library itself.

tgv · a year ago
> I would argue the problem is how dependencies in general are added to projects

But you need the functionality anyway, so there are two possible dependencies: on your own code, or on someone else's code. You can't avoid the dependency, and it comes at a cost.

If you don't know how to code the functionality, or it would take too much time, a library is a way out. But if you need leftPad or isNumber as an external dependency, that's so far in the other direction that it's practically a sign of incompetence.
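For reference, a minimal sketch using the built-in String.prototype.padStart (available since ES2017):

    // left-pads str with ch up to length len
    const leftPad = (str, len, ch = ' ') => String(str).padStart(len, ch);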

6510 · a year ago
If you're incompetent, it provides a way to be sure?

Could you, for laughs, explain which cases these are for, why they are needed, and why they did it this way?

1) num-num === 0

2) num.trim() !== ''

3) Number.isFinite(+num)

4) isFinite(+num)

5) return false;

6) Why this specific order of testing? Why prefer Number.isFinite over isFinite?

https://www.npmjs.com/package/is-number

   module.exports = function(num) {
     if (typeof num === 'number') {
       return num - num === 0;
     }
     if (typeof num === 'string' && num.trim() !== '') {
       return Number.isFinite ? Number.isFinite(+num) : isFinite(+num);
     }
     return false;
   };
I would have just....

    isNumber = num => isFinite(num+''.trim());
Why is that not precisely the same? (it isn't)

how about...

   function isNumber(num){
     switch(typeof num){
       case "number" : return !isNaN(num);
       case "string" : return isFinite(num) && !!num.trim();
     }
   }
Is there a difference?
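A few inputs that seem to separate them, if I'm reading the coercion rules right:

    isNumber(Infinity); // is-number: false (Infinity - Infinity is NaN);
                        // the switch version: true (!isNaN(Infinity))
    isNumber('   ');    // is-number: false (the trim() guard);
                        // the arrow version: true, because num+''.trim()
                        // parses as num + ''.trim(), i.e. num + '', and the
                        // global isFinite coerces '   ' to 0
    isNumber(true);     // is-number: false; the switch version hits no case
                        // and returns undefined, not false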

IMHO NPM should have a discussion page for this. There are probably interesting answers for all of those looking to copy and paste.

reaperducer · a year ago
> The UNIX philosophy is also built on the idea of small programs, just like micro-libraries: doing one thing and doing it well, and composing those things to make larger things.

This year I started learning FORTH, and it's very much this philosophy. To build a building, you don't start with a three-story slab of marble. You start with hundreds of perfect little bricks and fit them together.

If you come from a technical ecosystem outside the Unix paradigm, it can be hard to grasp.

ristos · a year ago
Yeah, exactly! FORTH looks really awesome, I haven't gotten around to learning it much though. I heard it's addictive and fun.

Yeah, it's all concatenative programming: FORTH, unix pipes, function composition as monoids, effect composition as Kleisli composition and monads, etc.

That makes it super useful for code readability (once you're familiar with the paradigm) and for debugging, since you can split up and decompose any part of your program to inspect and test it in isolation.
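In JS terms, a sketch of the composition idea (nothing framework-specific assumed):

    // compose(h, g, f)(x) === h(g(f(x))): each piece stays testable in isolation
    const compose = (...fns) => x => fns.reduceRight((v, f) => f(v), x);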

bborud · a year ago
This has nothing in common with the UNIX approach. Awk, grep, sort, less and the like are perhaps small, but not that small and not that trivial.
samatman · a year ago
Unix has yes, tr, cut, true, false, uniq, nl, id, fold, sort, sleep, head, tail, touch, wc, date, cal, echo, cat...

These are tiny programs.

I mean, sort has put on some weight over the years, sure. But if it were packaged up for npm people would call it a micro-library and tell you to just copy it into your own code.

kazinator · a year ago
Right! So if it is indeed so easy to understand what is going on, why would you need to make it an external dependency that can update itself behind your back?

If you understand what is going on, paste it into your tree.

mattlondon · a year ago
> Micro-libraries are really good actually, they're highly modular, self-contained code

Well, I think that is the point: they're not self-contained. You are adding mystery stuff, and who knows how deep the chain of dependencies goes. See the left-pad fiasco that broke so much stuff, because the chain of transitive dependencies ran deep and wide.

NPM is a dumpster fire in this regard. I try to avoid it - is there a flag you can set to say "no downstream dependencies" or something when you add a dependency? At least that way you can be sure things really are self-contained.

a_wild_dandan · a year ago
There is a "no downstream dependencies" option; it's called writing/auditing everything yourself. Everything else -- be it libraries, monolithic SaaS platforms, a coworker's PR, etc. -- is a trade off between your time and your trust. Past that, we're all just playing musical chairs with where to place that trust. There's no right answer.
ristos · a year ago
Yeah, there's a way to do that: yarn and pnpm can flatten the dependency tree. You can add the fork directly too:

yarn add <path/to/your/forked/micro-library.git>

pnpm add <path/to/your/forked/micro-library.git>

IgorPartola · a year ago
I remember adding a random date picker that pulled in a copy of React with it to a non-React project. NPM is a dumpster fire at a nuclear facility.
Toutouxc · a year ago
Do you know what else is all of that? Writing the five lines of code by hand. Or just letting a LLM generate it. This and everything else I want to reply has already been covered in the article.
ristos · a year ago
Nothing wrong with that either; like I said, copy-paste works too. A lot of minimalist projects will just copy the code into another project.

Forking the code and using that is arguably nicer though, IMO: it makes it easier to pull in new updates, and to track changes and bug fixes. I've tried both and find this approach nicer overall.

jvanderbot · a year ago
Micro libraries are ok - TFA even says you can use self-contained blocks as direct source.

Micro-dependencies are a goddamn nuisance, especially with all the transitive micro-dependencies that come along, often with different versions, alternative implementations, etc.

Ygg2 · a year ago
If you're writing micro-libraries without intending to reuse them, why are you making them libraries?
jaredsohn · a year ago
> I would argue the problem is how dependencies in general are added to projects

I haven't done anything with this myself (just brainstormed a bit with chatgpt) but I wonder if the solution is https://docs.npmjs.com/cli/v10/commands/npm-ci

Basically, enforce that all libraries have lock files and when you install a dependency use the exact versions it shipped with.
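In CI that looks like:

    npm ci   # installs exactly what package-lock.json pins, and errors out
             # if the lockfile and package.json disagree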

Edit: Can someone clarify why this doesn't work? Wouldn't it make installing node packages work the same way as it does in python, ruby, and other languages?

ristos · a year ago
I'm not sure why you're getting downvoted. The left-pad incident on npm primarily impacted projects that didn't have lockfiles or weren't pinning exact versions of their dependencies. I knew a few functional programmers who would freeze dependencies to exact versions before lockfiles came around, just to ensure builds were reproducible and wouldn't break in the future. Part of what was to blame was bad developer practice. I like npm ci.
mewpmewp2 · a year ago
These days with LLMs, doing leftPad yourself is incredibly easy, I would just do that.
VonGallifrey · a year ago
With LLMs? I don't think something like leftPad was ever difficult to create.
prng2021 · a year ago
Why even stop at micro-libraries? Instead of "return num - num === 0", why not create the concept of pico-libraries people can use, like "return isNumberSubtractedFromItselfZero(num)"? It's basically plain English, right?

You could say that if all the popular web frameworks in use today were rewritten to import and use hundreds of thousands of pico-libraries, their codebases would be, as you say, composed of many highly modular, self-contained pieces that are easy to understand.

/s


oftenwrong · a year ago
The primary cause of the left-pad incident was that left-pad was removed from the npm registry. Many libraries depended on left-pad. The same could have occurred with any popular library, whether micro or not.

To reformulate the statement made in the intro of this post: "maybe it’s not a great idea to outsource _any critical_ functionality to random people on the internet."

It has long been a standard, best practice in software engineering to ensure dependencies are stored in and made available from first-party sources. For example, this could mean maintaining an internal registry mirror that permanently stores any dependencies that are fetched. It could also be done by vendoring dependencies. The main point is to take proactive steps to ensure your dependencies will always be there when you need them, and to not blindly trust a third-party to always be there to give your dependencies to you.
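With npm, for example, that can be as simple as an .npmrc pointing installs at a mirror you control (the hostname here is made up):

    # .npmrc
    registry=https://npm-mirror.internal.example.com/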

klabb3 · a year ago
> To reformulate the statement made in the intro of this post: "maybe it’s not a great idea to outsource _any critical_ functionality to random people on the internet."

Well everything is critical in the sense that a syntax error could break many builds and CI systems.

This is what lock files are for. If they're used properly, and the registry is available, there are no massive issues. This is how things are supposed to work; all the tooling is made this way.

In short, I think the lessons from the leftpad debacle are (1) people don't use existing versioning tooling, (2) there is a surprising number of vendors involved if you look at the dep trees for completely normal functionality, and (3) the JS ecosystem is particularly fragmented, with poor API discipline and a non-existent stdlib.

EDIT: Just read up on it again and I misremembered. The author removed leftpad from NPM due to a dispute with the company regarding an unrelated package. That’s more of a mismanaged registry situation. You can’t mutate and remove published code without breaking things. Thus NPM wasn’t a good steward of their registry. If there’s a need to unpublish or mutate anything, there needs to be leeway and a path to migrate.

oftenwrong · a year ago
The key point is "If ... the registry is available", and the dependencies contained therein. We take on risk by relying on NPM to always be there and always provide us the dependencies we have already invested in. I'm arguing that organisations should take a more defensive stance against dependencies becoming unavailable. If you depend on it, keep a copy of it somewhere that you control.
Brian_K_White · a year ago
The problem with micro is that 100 micros means 100x more surface area, and 100x more chances for something to go wrong, than 1.
xg15 · a year ago
Micro-libraries are worse than no libraries at all, but I maintain they are still better than gargantuan "frameworks" or everything-but-the-kitchen-sink "util"/"commons" packages, where you end up using only a tiny fraction of the functionality but have to deal with the maintenance cost and attack surface of the whole thing.

If you're particularly unlucky, the unused functionality pulls in transitive dependencies of its own - and you end up with libraries in your dependency tree that your code is literally not using at all.

If you're even more unlucky, those "dead code" libraries will install their own event handlers or timers during load, or will be picked up by some framework autodiscovery mechanism, and will actually execute some code at runtime, just not any code that provides anything useful to the project. I think an apt name for this would be "undead code". (The examples I have seen were from Java frameworks like Spring and from webapps with too many autowired request filters, so I do hope this isn't such an issue in JS yet.)

zahlman · a year ago
> but I maintain they are still better than gargantuan "frameworks" or everything-but-the-kitchen-sink "util"/"commons" packages, where you end up using only a tiny fraction of the functionality but have to deal with the maintenance cost and attack surface of the whole thing.

Indeed. Several toy projects I've done were blown up in size by four orders of magnitude because of NumPy.

I only want multi-dimensional arrays that support reshaping and basic element-wise arithmetic, maybe matrix multiplication; I'm not even that concerned about performance.

But I have to pay for countless numerical algorithms I've never even heard of, provided by decades-old C and/or FORTRAN projects; even more higher-math concepts implemented in Python; NumPy's extensive (and fragmented: there's even compiled code for testing that's outside of any test folders) test suite that I'll never run myself; a bunch of backwards-compatibility hacks completely irrelevant to my use case; a python-to-fortran interface wrapper generator; a vendored copy of distutils even in the wheel; over 3MiB of .so files for random number generators; a bunch of C header files...

[Edit: ... and if I distribute an application, my users have to pay for all of that, too. They won't use those pieces either; and the likelihood that they can install my application into a venv that already includes NumPy is pretty low.]

I know it's fashionable to complain about dependency hell, but modularity really is a good thing. By my estimates, the total bandwidth used daily to download copies of NumPy from PyPI is on par with that used to stream the Baby Shark video from YouTube - assuming it's always viewed in 1080p. (Sources: yt-dlp info for file size; History for the Wikipedia article on most popular YouTube videos; pypistats.org for package download counts; the wheel I downloaded.)

DonHopkins · a year ago
Sometimes importing zombie "undead code" libraries can be beneficial!

I just refactored a bunch of python computer vision code that used detectron2 and yolo (both of which indirectly use OpenCV and PyTorch and lots of other stuff), and in the process of cleaning up unused code, I threw out the old imports of the yolo modules that we weren't using any more.

The yololess refactored code, which really didn't have any changes that should measurably affect the speed, ran a mortifying 10% slower, and I could not for the life of me figure out why!

Benchmarking and comparing each version showed that the yololess version was spending a huge amount of time with multiple threads fighting over locks, which the yoloful code wasn't doing.

But I hadn't changed anything relating to threads or locks in the refactoring -- I had just rearranged a few of the deck chairs on the Titanic and removed the unused yolo import, which seemed like a perfectly safe innocuous thing to do.

Finally after questioning all of my implicit assumptions and running some really fundamental sanity checks and reality tests, I discovered that the 10% slow-down in detectron2 was caused by NOT importing the yolo module that we were not actually using.

So I went over the yolo code I was originally importing line by line, and finally ran across a helpfully commented top-level call to fix an obscure performance problem:

https://github.com/ultralytics/yolov5/blob/master/utils/gene...

    cv2.setNumThreads(0)  # prevent OpenCV from multithreading (incompatible with PyTorch DataLoader)
Even though we weren't actually using yolo, just importing it, executing that one line of code fixed a terrible multithreading performance problem with OpenCV and PyTorch DataLoader fighting behind the scenes over locks, even if you never called yolo itself.

So I copied that magical incantation into my own detectron2 initialization function (not as top level code that got executed on import of course), wrote some triumphantly snarky comments to explain why I was doing that, and the performance problems went away!

The regression wasn't yolo's or detectron2's fault per se, just an obscure, invisible interaction between other modules they were both using. But yolo shouldn't be doing anything globally systemic like that at import time, before you've actually initialized it.

But then I would have never discovered a simple way to speed up detectron2 by 10%!

So if you're using detectron2 without also importing yolo, make sure you set the number of cv2 threads to zero or you'll be wasting a lot of money.

conradludgate · a year ago
This is mortifying. It should not be acceptable for imports to implicitly run code simply by existing.
franciscop · a year ago
Seems a lot like the classic "I list only a couple of the strong advantages and enumerate everything I can think of as a disadvantage". While I'm biased (I've done a bunch of these micro-libraries myself), there are more reasons I/OSS devs make them! To name other advantages (as a dev consuming them):

- Documentation: they are usually well documented, at least a lot better than your average internal piece of code.

- Portability: you learn it once and can use it in many projects, a lot easier than potentially copy/pasting a bunch of files from project to project (I used to do that and ugh what a nightmare it became!).

- Semi-standard: everyone on the team is on the same page about how something works. This works on top of the previous two TBF, but is distinct as well; e.g. if you use Axios, 50% of front-end devs will already know how to use it (edit: removed express since it's arguably not micro though).

- Plugins: now with a single "source" other parties or yourself can also write plugins that will work well together. You don't need to do it all yourself.

- Bugs! When there are bugs, now you have two distinct "entities" that have strong motivation to fix the bugs: you+your company, and the dev/company supporting the project. Linus's eyeballs and all (yes, this has a negative side, but those are also covered in the cons in the article already!).

- Bugs 2: when you happen upon a bug, a 3rd party might've already found it and fixed it, or offered an alternative solution! In fact I did just that today [1]

That said, I do have some projects where I explicitly recommend copy/pasting the code straight into your project, e.g. https://www.npmjs.com/package/nocolor (you can still install it though).

[1] https://github.com/umami-software/node/issues/1#issuecomment...

pton_xd · a year ago
Every team should eventually have some internal libraries of useful project-agnostic functionality. That addresses most of your points.

Copy-paste the code into your internal library and maintain it yourself. Don't add a dependency on { "assert": "2.1.0" }. It probably doesn't do what you actually want, anyway.

I think the more interesting point is that most projects don't know what they actually need and the code is disposable. In that scenario micro-libraries make some amount of sense. Just import random code and see how far you can get.

franciscop · a year ago
That's what I do for personal projects: I just run "npm publish"[1] on those and BAM, it's managed, secured, and versioned by npm, instead of having to copy/paste or dig for old/new versions in Git history.

[1] I lied, I don't even run npm publish, I made my own tool for easy publishing so I just run `happy "Fixed X bug" --patch`

qwerty456127 · a year ago
> Micro-libraries should never be used. They should either be copy-pasted into your codebase, or not used at all.

I would prefer them to be built straight into the languages.

jdminhbg · a year ago
Yes, everyone seems to take the wrong lesson from left-pad. The reason left-pad happened on NPM isn't that there's something uniquely wrong with how NPM was built, but that JS has a uniquely barren standard library. People aren't writing their own left-pad functions in Java or Go or Python, it's just in the stdlib.
_xiaz · a year ago
At the same time, Go is quite barren when it comes to list (slice) functions, but I largely agree about Java and Python.
flysand7 · a year ago
I was about to jump into the comment section and say something along the lines of "but no one really thinks they're actually good, right?", only to see the top comment arguing they're good.
_xiaz · a year ago
Astounding that this is as polarizing of a take as it seems to be
userbinator · a year ago
> and because it updates fairly frequently

I fail to comprehend how a single-function-library called "isNumber" even needs updating, much less "fairly frequently".

The debate around third-party code vs. self-developed is eternal. IMHO if you think you can do better than existing solutions for your use-case, then self-developed is the obvious choice. If you don't, then use third-party. This of course says a lot about those who need to rely on trivial libraries.

foul · a year ago
> I fail to comprehend how a single-function-library called "isNumber" even needs updating, much less "fairly frequently".

If someone uses isNumber as a fundamental building block and a surrogate for Elm or TypeScript (a transpiler intermediate that would treat numbers more soundly, I hope), this poor soul, whom I deeply pity, will encounter a lot of strange edge cases (like the one stated in the article: is NaN a number or not?), and if they fear the burden of forking the library they will try to inflict that burden upstream, enabling feature or config bloat.

I insinuate that installing isNumber is, like most of these basic microlibs, a symptom of incompetence in the use of the language. A worn JS dev would try isNaN(parseInt(num+'')) and sometimes succeed.

flysand7 · a year ago
> [...] and sometimes succeed

Nothing is ever certain when you program in javascript.

guestbest · a year ago
I think the updates are more for bugfixes around edge cases than feature additions.
consteval · a year ago
> I fail to comprehend how a single-function-library called "isNumber" even needs updating

Never underestimate the complexity and footgunny nature of JS' type system.

shiroiushi · a year ago
Until I read the comments here, I thought from the title that this was about those small neighborhood "libraries" that are basically a box the size of a very large birdhouse, mounted on a post, with a bunch of donated books inside that passersby are free to borrow. I was really wondering why someone would have a problem with these, unless they work for a book publisher.