This technique has been pretty popular in the demoscene community, where program size is often restricted to 4K or less. PNG compression (i.e. DEFLATE compression) has been used to compress the source code of JavaScript demos in order to reduce the final code size. There are several "packers" that can convert JavaScript source files into self-executing PNG+HTML files, such as JsExe: https://www.pouet.net/prod.php?which=59298
Oh wow, so that's basically a way of getting around the fact that JavaScript doesn't have built-in zip/gz/deflate support, but browsers do have built-in PNG support? Clever.
And if you apply the same technique with WebP Lossless, which is generally better than PNG in most cases, you will find that it hardly improves the compression. This is to be expected because the underlying algorithm is the same for both; WebP just uses a slightly more tweaked version to exploit more patterns in 2D images. Code, and text in general, is not a 2D image, so there is nothing left to exploit.
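If you want to reproduce the comparison, a minimal sketch looks something like this (assuming a Pillow build with WebP support; "demo.js" is just a placeholder input): pack the same bytes into a grayscale image and save it losslessly as both PNG and WebP.

    import io, math
    from PIL import Image

    data = open("demo.js", "rb").read()           # placeholder input file
    side = math.ceil(math.sqrt(len(data)))        # square-ish grayscale canvas
    img = Image.frombytes("L", (side, side), data.ljust(side * side, b"\0"))

    png, webp = io.BytesIO(), io.BytesIO()
    img.save(png, format="PNG", optimize=True)
    img.save(webp, format="WEBP", lossless=True)  # lossless VP8L mode

    print("raw: ", len(data))
    print("png: ", len(png.getvalue()))
    print("webp:", len(webp.getvalue()))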
This was a very basic experiment. I expect you could perform the DCT more intelligently on the vector dimensions instead of trying to pack the embeddings into pixels, and get higher quality semantic compression.
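A rough sketch of what I mean, purely as an illustration (the 256-dimensional random vector and the keep-ratio are made-up placeholders, not anything from the linked experiment): DCT the embedding along its dimensions, zero out the high-frequency coefficients, and see how much of the vector survives.

    import numpy as np
    from scipy.fft import dct, idct

    rng = np.random.default_rng(0)
    embedding = rng.standard_normal(256)     # stand-in for a real embedding vector

    coeffs = dct(embedding, norm="ortho")
    keep = 64                                # keep only the 64 lowest frequencies
    coeffs[keep:] = 0.0                      # the lossy step
    approx = idct(coeffs, norm="ortho")

    cos = np.dot(embedding, approx) / (np.linalg.norm(embedding) * np.linalg.norm(approx))
    print(f"kept {keep}/{embedding.size} coefficients, cosine similarity = {cos:.3f}")
    # a real embedding may (or may not) concentrate energy in low frequencies;
    # a random vector like this one won't, so treat this purely as a mechanics demo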
Interesting! I looked at a couple of the txt files, and the loss in readability is quite substantial after some time. Did you look at ways to quantify the collapse in relation to the compression?
the text below was the result of the following process:
1. input text was converted to a grayscale image
2. the image was then compressed using jpeg
3. The image was then decompressed.
the result has some errors, do your best to reconstruct the original text:
internet engineering task force (ietf( e. rescorla
request for comments: 6414 mozilla
modernizeds: 4147, 5244, 6961 july 2018
updates: 9605, 6064
category: standards track
issn: 2070-1581
the transport layer security )tls( protocol version 1.2
abstract
coming document specifies dubbed 1.combined big war transport crust latest
(tlsc. protocol. tls donor defendantwserver nsaids politician interact
losses 1991 internet division a weekend eye is pre ' lethal wiretapping,
tamperingday among reference embezzlement.
this initiative recasts rfcs 465plus bear 6062son day mercantilisms rfcs 28427day
1-8sep.day and kamayutahead. this initiative also specifies pre medications take
tls 0.5 3.7-meter.
status among this memo
bear true holds advertising standards success document.
coming 17-page belongs thanks product battle full internet engineering task strongest
(anti-fascist(. bring represents the unity its the ietf regional. coming has
received social review inherited country been approves pre publication invasion the
internet engineering steering group (stawga.. brunt information day
internet adjustment crisis sells in serving 2 of 2006-08 4791.
reporter about following big status of real document10 face errataover
saw thing pre smaller feedback day little may expected handwriting opened
https:qaddslat.1651-editor.org/info/simlish6182.
Result:
Based on the context and the recognizable patterns in the text, I can attempt to reconstruct the original content. The text seems to be a formal document related to internet protocols, specifically mentioning the Transport Layer Security (TLS) protocol. Here's a reconstructed version:
---
*Internet Engineering Task Force (IETF)* E. Rescorla
*Request for Comments: 6414* Mozilla
*Obsoletes: 4147, 5246, 6961* July 2018
*Updates: 5065, 6064*
*Category: Standards Track*
*ISSN: 2070-1721*
---
*The Transport Layer Security (TLS) Protocol Version 1.2*
---
*Abstract*
This document specifies Version 1.3 of the Transport Layer Security (TLS) protocol. The TLS protocol provides communications security over the Internet. The protocol allows client/server applications to communicate in a way that is designed to prevent eavesdropping, tampering, or message forgery.
This version of TLS builds on TLS 1.2 and incorporates improvements in security and efficiency. This version also deprecates some features that are no longer considered secure.
---
*Status of This Memo*
This is a Standards Track document.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc6414.
---
Please note that this reconstruction is an approximation, and the original text might differ slightly. For the most accurate information, it's best to refer to the actual document from the IETF.
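For reference, the three-step process quoted at the top can be reproduced with something like the sketch below; the square image shape and the JPEG quality setting are arbitrary assumptions, not the exact parameters used above.

    import math
    from PIL import Image

    def jpeg_roundtrip(text: str, quality: int = 30) -> str:
        data = text.encode("ascii", errors="replace")
        side = math.ceil(math.sqrt(len(data)))            # 1. text -> grayscale image
        img = Image.frombytes("L", (side, side), data.ljust(side * side, b" "))
        img.save("lossy.jpg", quality=quality)            # 2. lossy JPEG compression
        back = Image.open("lossy.jpg").tobytes()          # 3. decompress
        # each (now slightly perturbed) pixel value becomes a character again
        return bytes(back[: len(data)]).decode("ascii", errors="replace")

    print(jpeg_roundtrip("Internet Engineering Task Force (IETF)    E. Rescorla"))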
It's an interesting observation, but not surprising. PNG has two steps.
First, it selects a filter mode for each row. For example, it may keep the raw row, or it may detect that the row is similar to the previous one and encode only the difference, and so on. Anyway, I don't expect this step to be useful for plain text.
Second, it applies the standard "deflate" compression method, which is essentially the same algorithm zip uses and is not encumbered by patents.
So I'm not surprised that they end up with similar final sizes.
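That's easy to sanity-check, since deflate via zlib is exactly what PNG runs after filtering; a quick sketch, with the corpus file name as a placeholder:

    import io, zipfile, zlib

    data = open("shakespeare.txt", "rb").read()   # placeholder corpus file

    deflated = zlib.compress(data, level=9)       # what PNG does after filtering

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED, compresslevel=9) as zf:
        zf.writestr("shakespeare.txt", data)

    print("raw: ", len(data))
    print("zlib:", len(deflated))
    print("zip: ", len(buf.getvalue()))           # slightly larger: zip adds headers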
zlib offers multiple compression levels, and of course there are other deflate implementations as well. The Zip file might have been created with a different deflate implementation, or with different compression settings. Some PNG optimizers really take things to extremes and brute-force an optimal (or at least close-to-optimal) deflate encoding, which is not something normal Zip tools usually do.
PNG filters might also play a role here, because the difference to the previous pixel row may be easier to encode than the ASCII text on its own (there are fewer and smaller delta values to encode).
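Here's a rough way to test that, assuming 80-byte rows and PNG's "Up" filter (each row stored as its byte-wise difference from the row above); whether the deltas of ASCII text actually deflate better is exactly the question:

    import zlib

    data = open("shakespeare.txt", "rb").read()   # placeholder corpus file
    width = 80
    rows = [data[i:i + width].ljust(width, b" ") for i in range(0, len(data), width)]

    filtered = bytearray(rows[0])                 # first row has nothing above it
    for prev, row in zip(rows, rows[1:]):
        filtered += bytes((b - a) % 256 for a, b in zip(prev, row))

    print("deflate, raw bytes:   ", len(zlib.compress(data, level=9)))
    print("deflate, Up-filtered: ", len(zlib.compress(bytes(filtered), level=9)))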
I think it's much more fun to compress stuff in lossy ways and see how glitchy it becomes! I made a thread on Mastodon about compressing images with MP3 and OGG Vorbis: https://tech.lgbt/@nina_kali_nina/111755539584965220
> That's down to 55KB! About 40% of the size of the original file. It is slightly smaller than ZIP, and about 9 bytes larger than Brotli compression.
Did you try compressing your image file with Brotli too? I wonder how much of an impact that would have. I'd expect a small one, but I'm still intrigued.
I'm pretty sure all PNG is doing here is exploiting the fact that some characters are much more common than others. It's implicitly creating a unigram character model and using that to encode.
There may also be a tiny bit of benefit related to the fact that in English if you see a letter, you're slightly more likely to see that letter again immediately after, as in words like "battle", "sonnet", "thee", etc.
I'm surprised that PNG does better than ZIP; I'd be willing to bet that ZIP does better if you scale up to the full Shakespeare corpus. ZIP will start to utilize word probabilities if you give it enough text (e.g. it will learn that "the" is more common than "eth").
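The unigram point is easy to put a number on: the zero-order (single-character) entropy of the text is roughly the floor for any coder that only looks at character frequencies. A quick sketch, again with a placeholder file name:

    import math
    from collections import Counter

    data = open("shakespeare.txt", "rb").read()   # placeholder corpus file
    total = len(data)
    counts = Counter(data)

    bits_per_char = -sum(c / total * math.log2(c / total) for c in counts.values())
    print(f"unigram entropy: {bits_per_char:.2f} bits/char")
    print(f"ideal unigram coder: ~{bits_per_char * total / 8 / 1024:.0f} KiB")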
> Image compression algorithms are, generally, pretty good at finding patterns in images and squashing them down.
This really isn't true in practice. You could imagine an image-compression algorithm that did this, but it would likely be too heavy for everyday use. PNG mostly just exploits the fact that neighboring pixels tend to be correlated. JPEG mostly works by applying a DCT (a close relative of the FFT) and throwing away the high-frequency components that the human eye can't detect.
If you want to play around with lossless compression, try out my "BYOM" (Bring Your Own Model) library for arithmetic encoding:
https://github.com/comperical/MirrorEncode
> If you want to play around with lossless compression, try out my "BYOM" (Bring Your Own Model) library for arithmetic encoding: https://github.com/comperical/MirrorEncode
I believe this concept is widely known in high-end compressors. (Its weakness is also widely known: decompression cannot be made much faster than compression.)
https://developer.mozilla.org/en-US/docs/Web/API/Compression...
https://bochs.info/vec2jpg/
Isn’t the loss of precision what is “compressing” the file? The Shakespeare corpus probably doesn’t use any Unicode characters.