What CSS minifiers also leave behind

This looks like a fun project indeed!

Unfortunately every time I read something about minifiers I got the feeling that people are optimizing the wrong problem.

If you gzip data over the line it's already compressed. So minifying your stuff will only help you a little.

The problem is on the client side. You can compress what you like but if the browser starts dropping frames because it has to compile/handle a ton of Javascript and CSS then minifying doesn't help the end user.

dspillett · 8 years ago

> If you gzip data over the line it's already compressed. So minifying your stuff will only help you a little.

For small files you might be mostly correct, but for larger ones min+compress can product much better gains than compression alone.

IIRC the algorithm used employs a rolling compression window, and can only match strings of tokens whose distance apart is smaller than that window. IIRC the default window is 8KBytes and the maximum is 32KBytes. Even if you use the maximum at the expense of CPU time that isn't going to cover many large files. Minifying increases the effective range of the compression window, each match is shorter but you will find more matches and usually this balances out in a way that benefits the compression result.

It isn't quite that simple in reality as there is huffman encoding and other tricks in the mix. This means that even for inputs smaller than the compression window you may see some benefit as minifying can reduce the input data's alphabet significantly.

Ignoring the "why it helps", it is easy to show that it does help in a great many real cases:

  ds@s2:/tmp$ wget --quiet https://code.jquery.com/jquery-3.2.0.min.js
  ds@s2:/tmp$ wget --quiet https://code.jquery.com/jquery-3.2.0.js
  ds@s2:/tmp$ gzip jquery-3.2.0.min.js
  ds@s2:/tmp$ gzip jquery-3.2.0.js
  ds@s2:/tmp$ ls -l j*
  -rw-r--r-- 1 ds ds 79201 Mar 16 21:30 jquery-3.2.0.js.gz
  -rw-r--r-- 1 ds ds 30023 Mar 16 21:30 jquery-3.2.0.min.js.gz

In this example the result of min+comp is less than 40% the size of the result from compression alone.

For completeness, minifying alone achieves less than compression alone:

  -rw-r--r-- 1 ds ds 267686 Mar 16 21:30 jquery-3.2.0.js
  -rw-r--r-- 1 ds ds  79201 Mar 16 21:30 jquery-3.2.0.js.gz
  -rw-r--r-- 1 ds ds  86596 Mar 16 21:30 jquery-3.2.0.min.js
  -rw-r--r-- 1 ds ds  30023 Mar 16 21:30 jquery-3.2.0.min.js.gz

One further factor is CPU time consumed on the client decompressing and parsing the content but this is likely to be insignificant compared to the network or local IO time, if a device's CPU is under-powered enough that this is significant then it is unlikely to be able to run the decompressed code with useful performance.

ricardobeat · 8 years ago

Most of the gains in there are from stripping out comments. That plus whitespace removal gets you most of the benefit. I don't think the parent was advocating for dropping minification completely, but investing massive effort when you're already at the crest of the curve.

mattmanser · 8 years ago

Are you sure that's not just comments being removed?

remy_luisant · 8 years ago

I had an article about that too. If you have to do just one, you should go with zopfli or brotli instead of minifying. Having both minification and some kind of compression on top does help the file sizes.

https://luisant.ca/brotli-css

Also, purifycss and uncss are your friends to cut stuff down, to reduce the final load for the user.

zuzun · 8 years ago

> So minifying your stuff will only help you a little.

True, the difference between 10KB compressed and 7KB minified+compressed is negligible for your visitors, but it still takes 30% off of your traffic bill.

pasta · 8 years ago

This might be the only valid reason. But only for website that have a huge amount of traffic.

Klathmon · 8 years ago

One massive thing minifying does is dead code elimination (slightly less applicable to CSS but it still applies using some build stacks)

We can build a "prod" version of the app and the minifying process will drop all the debugging code as well as any unused or uncalled functions from the output.

billyhoffman · 8 years ago

What JS Minifier do you know that does dead code elimination?

I would have thought that understanding what functions of a dynamic language that can be safely removed would require parsing/AST analysis beyond those found in the typical minifier.

dottedmag · 8 years ago

I have seen a script which was 250kb gzipped, but 125k minified+gzipped.

It was a script embedded to the other people's pages (and yes, it delivered substantial functionality, it was not just a tracker), so minification saved a lot of traffic/money for the company.

bennofs · 8 years ago

Minifying also speeds up decompression, because less data has to be produced by the decompressor. Compression and minifying are really different optimizations, as the minification does not need to be reversed. So each one has benefits.

the_duke · 8 years ago

It will still have a small impact:

-) Less work for decompression

-) Less total length means lexing+parsing will be a bit quicker

-) shorter class names will also mean a lower memory consumption because of shorter strings, and ideally fewer allocations if some pooling or smart allocator is used

But those points can probably be completely ignored, since JS is a way way bigger factor.

dkersten · 8 years ago

Does minification speed up parsing (less characters to tokenise)? If so, then minification+compression would be better than compression alone as it would make up a bit for the time spent decompressing.

bigbugbag · 8 years ago

Gzip compression over https is a vulnerabilty[1].

Depending on the scale, shaving a few kB here and there can amount to significant savings in the long run.

[1]: https://en.wikipedia.org/wiki/BREACH

psi-squared · 8 years ago

I am not a security researcher, but I think you could keep the benefits of both compression and security, as long as you're careful on the server side:

Say you have a document structured like [boring data] [secret data] [boring data]. I don't know if any existing compressor lets you do this, but the gzip file format (really the 'deflate' format used inside it) allows you to encode this (schematically) as follows:

[compressed boring data] || [uncompressed secret data] || [compressed boring data]

where each || is i) a chunk boundary (the Huffman compression stage is done per-chunk, so this avoids leaks at that level), and ii) a point where the encoder forgets its history - ie, you simply ban the encoder from referencing across the || symbols.

If you wanted, you could even allow references between different "boring" chunks (since the decoder state never needs resetting), just as long as you make sure not to reference any of the secret data chunks.

Edit to add: Also, if the "boring" parts are static, you can pre-compress just those chunks and splice them together, potentially saving you from having to fully recompress an "almost static" document just because it has some dynamic content.

flukus · 8 years ago

The other benefit is from combining files and reducing the number of http requests. Minifiers are really needed for that, but the do make for some nicer development workflows.

camus2 · 8 years ago

> The other benefit is from combining files and reducing the number of http requests. Minifiers are really needed for that, but the do make for some nicer development workflows.

debatable with HTTP2 . Furthermore, separate files are easier to cache. If one of them doesn't change it doesn't have to be loaded again. That's my experience with bundles, especially when one uses asynchronous module definition instead of babel, webpack and co.

mifreewil · 8 years ago

Number of HTTP requests is not a concern with HTTP2 server push and multiplexing. In fact it's usually better to have 2 fairly sized files that can be downloaded in parallel rather than 1 large file.

I have a slightly-related question for those of you familiar with Webpack, css modules (css-loader/style-loader), and perhaps React as well: is there any reason not to use the 'default' approach where the styles for the components are simply inserted in a <style> (with unique, generated classnames)?

To be clear: I don't mean philosophical reasons. I personally love letting javascript deal with the 'cascading' part and I don't have a problem with the idea of having styling embedded in the final page.

What I'm curious about is if this has any kind of negative impact on performance, bandwidth, etc. Because the CSS is loaded on the component level, and because Webpack 2 does tree shaking, the page will be guaranteed to only contain CSS for the components that are on the page. And if I'd 'lazy-load' parts of the app, I'd get that benefit for my CSS as well with no extra effort.

On the other hand, any benefits of having a compiled (and hopefully cached) bundle.css are offset by the need for an extra request for the css file, as well as the very likely situation that there'll be a bunch of unused css in that bundle.

Am I missing some drawback to the above-mentioned approach?

bastawhiz · 8 years ago

When you're using a loader, the CSS still exists, it's just a big string in your JS bundle. By default, I believe css-loader/style-loader will use cssnano to minify the CSS within your bundle.

What will be very interesting in the coming years (as the work gets done around it) is "full css" optimization. That is, when you know you have all of the styles for the whole page available to the minifier. If the minifier knows that no other CSS is being loaded, it can do a lot more work to remove and merge rulesets. In the case of styles bundled with Webpack, common CSS could be reduced even further, after tree shaking has taken place.

mercer · 8 years ago

In the long run I think we're more likely to end up with a full js-based styling approach that, similar to JSX, might look like CSS but really directly styles individual nodes and 'manages' them.

But this is probably quite a ways off.

WorldMaker · 8 years ago

In a past life a website I worked on had a huge browser paint performance and content flash issue that was eventually cleared out by moving all the styles out of <style> tags in the DOM.

bastawhiz · 8 years ago

Most webpack loaders put them in the <head>

snitko · 8 years ago

Embedding css into a webpage forces this css to be loaded every time. Even if the css only contains what's needed for the page, it can still be a lot. Caching it is just common sense. I think if you try and calculate, there will be a lot of bandwidth saved if you separate css into a file.

ricardobeat · 8 years ago

Not true. The CSS is inlined as strings in the js file, and benefits from caching just as much as the rest of your templates (this is for single-page apps).

josephg · 8 years ago

The scientific notation one is a bug. Scientific notation isn't part of the CSS spec[1], and its not supported in all browsers.

I learned this one the hard way a few months ago. We ran into a flexbox bug in one browser which we worked around by adding some-rule: 0.0000001px instead of 0px. However, our minifier collapsed that using scientific notation, which triggered a rendering issue in a different browser due to the out-of-spec CSS. The whole adventure left me feeling like I'd travelled back in time.

[1] https://www.w3.org/TR/CSS21/syndata.html#numbers

legulere · 8 years ago

Scientific notation seems to be part of CCS3:

https://www.w3.org/TR/css3-values/#numbers

Which browser had problems with it?

masklinn · 8 years ago

"q" is also CSS3, incidentally, and its background is interesting[0]: it's a mostly japanese metric typographical unit[1], it replaces the point, and is slightly smaller: q is 0.25mm while pt is ~0.3528mm (precisely 1/72th of an inch which is 25.4mm).

[0] http://tosche.net/2013/10/font-size-in-the-metric-system_e.h...

[1] although non-japanese typographers like Otl Aicher have recommended its use it doesn't seem to have had much success outside Japan

TazeTSchnitzel · 8 years ago

> Scientific notation isn't […] supported in all browsers.

This has tripped me up many a time when I've created CSS colour strings (mostly for <canvas> use) by concatenating Strings and Numbers in JavaScript. When a Number gets small enough, it ends up in scientific notation, and the CSS parser rejects it.

rectangletangle · 8 years ago

That's a hell of a bug, always an adventure when the error locality is really far.

Author here.

Wow. #1 on HN. Wow.

I'd usually hang around a bit more, but I'm really tired. I posted this past my midnight. 00:51 now, and I'm fading fast.

Thanks for all the love, everyone. I'll come over tomorrow (12 hours from now, or so) to answer any questions or to pick up any corrections.

catenthusiast · 8 years ago

I hope you sleep well.

cornedor · 8 years ago

> I'm guessing that at nine nines that is pretty much a one anyway and it would not even change a single pixel on the screen.

There used to be a bug with flex-wrap: wrap; where an element would wrap to the next line while it should have fit. You could fix it by instead using width: 25%; use width: 24.999999%; so it would be 25% on the screen but it would fix the problem so it didn't wrap to the next line. So you should look out with this.

b34r · 8 years ago

That sounds like the time-honored single whitespace bug. Usually fix it with margin-left: -4px on all but the first element in the row.

Nice catch. Yeah, I do something like that too in my own code, though it does not cause issues when minified.

replete · 8 years ago

Don't mean to squash any enthusiasm, but these types of 1byte optimization savings don't really have real-world benefits due to over-the-wire compression like gzip and Brotli.

A more interesting problem to solve, I think, is that of optimising CSS rules for browser rendering.

phire · 8 years ago

I think that there is merit in designing a minifier that is explicitly designed to optimise the gziped size rather than the uncompressed size.

Things like:

* Rearrange rules within the file to put similar rules within the sliding window.

* Rearrange rules so that tail of the last declaration of one rule and the start of the next selector create the longest possible common substring.

* Rearrange the order of declarations within the rules to maximize the length of common substrings that span two declarations, ie ": 2em;\nbackground-color: rgb("

I'm working on one, though I'm not sure if it will ever see the light of the day. I have four months of free time and if someone would feed and house me for that time, I'd do it and open source it.

Any sponsors? No? Didn't think so. Not even you, big G? Aww...

For now some minifiers do sort the values, which helps.

CapacitorSet · 8 years ago

I partly agree. Though removing one or two bytes more than another minifier doesn't really matter that much, what matters is being able to deduplicate CSS as well as doing the usual whitespace elimination. SASS and SCSS seem to have a bit of a problem with duplicated CSS.

52-6F-62 · 8 years ago

It's funny you should say that. I'm curious, do you have an example?

I'm engaging in this ugly probably unsightly (but helpfully quick and maintainable) practice in a project with a short developmental cycle right now, and I've yet to have any issues outside of temporarily forgetting that I have already globally defined a specific style or enclosed a style I thought I'd left global.

(It's a corp. annual report -- that my team got tasked with as a favour -- so it has some repeating styles, and others isolated between pages)

One of the main reasons these optimizations happen is actually to make compression better. If you have a mix of px/pt/cm/mm in your stylesheet, it's more than likely that making them consistent (in their smallest possible form) makes them more uniform, making them more compressible.

I used to have a doc with actual numbers, but I've since lost it. If I dig it up, I'll link it here.

Silhouette · 8 years ago

Here's the same author's earlier post on this subject, "The missed chances: What minifiers leave behind", from last week:

https://luisant.ca/css-opts-survey

cfqycwz · 8 years ago

Anybody know if the transparency one is actually a desirable optimization? Iirc, you might want to assign a color to your transparency so it's not shifting hue as you fade it in through CSS transitions, animations, or JS.

Author of crass here. That's interesting: if you have a specific case where the browser doesn't do what you'd expect, I'd love to see it in a Github issue!

https://github.com/mattbasta/crass/issues/new

ovao · 8 years ago

crass is doing some really wonderful stuff here -- I'm impressed!

It's very interesting, however, that no one minifier is a consistent winner in these test cases, and that running CSS through multiple minifiers is actually, potentially, not all that crazy. (The very debatable real value in doing that notwithstanding.)

Yeah, crass is pretty amazing.

Have you seen my post on the Remynifier, where I do exactly that?

https://luisant.ca/remynifier

I appreciate the compliment! (author of crass here)

I've mentioned it before, but it's really not a great idea to use multiple minifiers. Minifier bugs can get nasty, and using multiple minifiers exponentially increases the likelihood that you'll encounter some weird or broken behavior. Make sure to test thoroughly.

And I will mention it one more time: I totally agree with you. It needs to be said.

abritinthebay · 8 years ago

Really? It seems that CSSO was the strong winner.

It didn't possibly create bugs by rewriting to new units (especially poorly supported units like q) and had the best results overall.

I'd like it to be a wee bit more aggressive on the rounding but other than that it seemed a clear winner.