This technique has been pretty popular in the demoscene community, where program size is often restricted to 4K or less. PNG compression (i.e. DEFLATE compression) has been used to compress the source code of JavaScript demos in order to reduce the final code size. There are several "packers" that can convert JavaScript source files into self-executing PNG+HTML files, such as JsExe: https://www.pouet.net/prod.php?which=59298
Oh wow, so that's basically a way of getting around the fact that JavaScript doesn't have built-in zip/gz/deflate support, but browsers do have built-in PNG support? Clever.
And if you apply the same technique with WebP Lossless, which is generally better than PNG in most cases, you will find that it hardly improves the compression. This is to be expected because the underlying algorithm is the same for both; WebP just uses a slightly more tweaked version to exploit more patterns in 2D images. Code, and text in general, is not a 2D image, so there is nothing left to exploit.
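If you want to reproduce the comparison, a minimal sketch looks something like this (assuming a Pillow build with WebP support; "demo.js" is just a placeholder input): pack the same bytes into a grayscale image and save it losslessly as both PNG and WebP.

    import io, math
    from PIL import Image

    data = open("demo.js", "rb").read()           # placeholder input file
    side = math.ceil(math.sqrt(len(data)))        # square-ish grayscale canvas
    img = Image.frombytes("L", (side, side), data.ljust(side * side, b"\0"))

    png, webp = io.BytesIO(), io.BytesIO()
    img.save(png, format="PNG", optimize=True)
    img.save(webp, format="WEBP", lossless=True)  # lossless VP8L mode

    print("raw: ", len(data))
    print("png: ", len(png.getvalue()))
    print("webp:", len(webp.getvalue()))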
This was a very basic experiment. I expect you could perform the DCT more intelligently on the vector dimensions instead of trying to pack the embeddings into pixels, and get higher quality semantic compression.
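A rough sketch of what I mean, purely as an illustration (the 256-dimensional random vector and the keep-ratio are made-up placeholders, not anything from the linked experiment): DCT the embedding along its dimensions, zero out the high-frequency coefficients, and see how much of the vector survives.

    import numpy as np
    from scipy.fft import dct, idct

    rng = np.random.default_rng(0)
    embedding = rng.standard_normal(256)     # stand-in for a real embedding vector

    coeffs = dct(embedding, norm="ortho")
    keep = 64                                # keep only the 64 lowest frequencies
    coeffs[keep:] = 0.0                      # the lossy step
    approx = idct(coeffs, norm="ortho")

    cos = np.dot(embedding, approx) / (np.linalg.norm(embedding) * np.linalg.norm(approx))
    print(f"kept {keep}/{embedding.size} coefficients, cosine similarity = {cos:.3f}")
    # a real embedding may (or may not) concentrate energy in low frequencies;
    # a random vector like this one won't, so treat this purely as a mechanics demo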
Interesting! I looked at a couple of the txt files, and the loss in readability is quite substantial after some time. Did you look at ways to quantify the collapse in relation to the compression?
the text below was the result of the following process:
1. input text was converted to a grayscale image
2. the image was then compressed using jpeg
3. The image was then decompressed.
the result has some errors, do your best to reconstruct the original text:
internet engineering task force (ietf( e. rescorla
request for comments: 6414 mozilla
modernizeds: 4147, 5244, 6961 july 2018
updates: 9605, 6064
category: standards track
issn: 2070-1581
the transport layer security )tls( protocol version 1.2
abstract
coming document specifies dubbed 1.combined big war transport crust latest
(tlsc. protocol. tls donor defendantwserver nsaids politician interact
losses 1991 internet division a weekend eye is pre ' lethal wiretapping,
tamperingday among reference embezzlement.
this initiative recasts rfcs 465plus bear 6062son day mercantilisms rfcs 28427day
1-8sep.day and kamayutahead. this initiative also specifies pre medications take
tls 0.5 3.7-meter.
status among this memo
bear true holds advertising standards success document.
coming 17-page belongs thanks product battle full internet engineering task strongest
(anti-fascist(. bring represents the unity its the ietf regional. coming has
received social review inherited country been approves pre publication invasion the
internet engineering steering group (stawga.. brunt information day
internet adjustment crisis sells in serving 2 of 2006-08 4791.
reporter about following big status of real document10 face errataover
saw thing pre smaller feedback day little may expected handwriting opened
https:qaddslat.1651-editor.org/info/simlish6182.
Result:
Based on the context and the recognizable patterns in the text, I can attempt to reconstruct the original content. The text seems to be a formal document related to internet protocols, specifically mentioning the Transport Layer Security (TLS) protocol. Here's a reconstructed version:
---
*Internet Engineering Task Force (IETF)* E. Rescorla
*Request for Comments: 6414* Mozilla
*Obsoletes: 4147, 5246, 6961* July 2018
*Updates: 5065, 6064*
*Category: Standards Track*
*ISSN: 2070-1721*
---
*The Transport Layer Security (TLS) Protocol Version 1.2*
---
*Abstract*
This document specifies Version 1.3 of the Transport Layer Security (TLS) protocol. The TLS protocol provides communications security over the Internet. The protocol allows client/server applications to communicate in a way that is designed to prevent eavesdropping, tampering, or message forgery.
This version of TLS builds on TLS 1.2 and incorporates improvements in security and efficiency. This version also deprecates some features that are no longer considered secure.
---
*Status of This Memo*
This is a Standards Track document.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc6414.
---
Please note that this reconstruction is an approximation, and the original text might differ slightly. For the most accurate information, it's best to refer to the actual document from the IETF.
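For reference, the three-step process quoted at the top can be reproduced with something like the sketch below; the square image shape and the JPEG quality setting are arbitrary assumptions, not the exact parameters used above.

    import math
    from PIL import Image

    def jpeg_roundtrip(text: str, quality: int = 30) -> str:
        data = text.encode("ascii", errors="replace")
        side = math.ceil(math.sqrt(len(data)))            # 1. text -> grayscale image
        img = Image.frombytes("L", (side, side), data.ljust(side * side, b" "))
        img.save("lossy.jpg", quality=quality)            # 2. lossy JPEG compression
        back = Image.open("lossy.jpg").tobytes()          # 3. decompress
        # each (now slightly perturbed) pixel value becomes a character again
        return bytes(back[: len(data)]).decode("ascii", errors="replace")

    print(jpeg_roundtrip("Internet Engineering Task Force (IETF)    E. Rescorla"))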
It's an interesting observation, but not surprising. PNG has two steps.
First, it selects a filter mode for each row. For example, it may keep the raw row, or it may detect that the row is similar to the previous one and encode only the difference, and so on. Anyway, I don't expect this step to be useful for plain text.
Second, it applies the standard "deflate" compression method, which is essentially the same algorithm zip uses and is not encumbered by patents.
So I'm not surprised that they end up with similar final sizes.
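That's easy to sanity-check, since deflate via zlib is exactly what PNG runs after filtering; a quick sketch, with the corpus file name as a placeholder:

    import io, zipfile, zlib

    data = open("shakespeare.txt", "rb").read()   # placeholder corpus file

    deflated = zlib.compress(data, level=9)       # what PNG does after filtering

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED, compresslevel=9) as zf:
        zf.writestr("shakespeare.txt", data)

    print("raw: ", len(data))
    print("zlib:", len(deflated))
    print("zip: ", len(buf.getvalue()))           # slightly larger: zip adds headers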
zlib offers multiple compression levels, and of course there are other deflate implementations as well. The Zip file might have been created with a different deflate implementation, or with different compression settings. Some PNG optimizers really take things to extremes and brute-force an optimal (or at least close-to-optimal) deflate encoding, which is not something normal Zip tools usually do.
PNG filters might also play a role here, because the difference to the previous pixel row may be easier to encode than the ASCII text on its own (there are fewer and smaller delta values to encode).
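Here's a rough way to test that, assuming 80-byte rows and PNG's "Up" filter (each row stored as its byte-wise difference from the row above); whether the deltas of ASCII text actually deflate better is exactly the question:

    import zlib

    data = open("shakespeare.txt", "rb").read()   # placeholder corpus file
    width = 80
    rows = [data[i:i + width].ljust(width, b" ") for i in range(0, len(data), width)]

    filtered = bytearray(rows[0])                 # first row has nothing above it
    for prev, row in zip(rows, rows[1:]):
        filtered += bytes((b - a) % 256 for a, b in zip(prev, row))

    print("deflate, raw bytes:   ", len(zlib.compress(data, level=9)))
    print("deflate, Up-filtered: ", len(zlib.compress(bytes(filtered), level=9)))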
I think it's much more fun to compress stuff in lossy ways and see how glitchy it becomes! I made a thread on Mastodon about compressing images with MP3 and OGG Vorbis: https://tech.lgbt/@nina_kali_nina/111755539584965220
> That's down to 55KB! About 40% of the size of the original file. It is slightly smaller than ZIP, and about 9 bytes larger than Brotli compression.
Did you try compressing your image file with Brotli too? I wonder how much of an impact that would have. I'd expect a small one, but I'm still intrigued.
I'm pretty sure all PNG is doing here is exploiting the fact that some characters are much more common than others. It's implicitly creating a unigram character model and using that to encode.
There may also be a tiny bit of benefit related to the fact that in English if you see a letter, you're slightly more likely to see that letter again immediately after, as in words like "battle", "sonnet", "thee", etc.
I'm surprised that PNG does better than ZIP; I'd be willing to bet that ZIP does better if you scale up to the full Shakespeare corpus. ZIP will start to utilize word probabilities if you give it enough text (e.g. it will learn that "the" is more common than "eth").
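The unigram point is easy to put a number on: the zero-order (single-character) entropy of the text is roughly the floor for any coder that only looks at character frequencies. A quick sketch, again with a placeholder file name:

    import math
    from collections import Counter

    data = open("shakespeare.txt", "rb").read()   # placeholder corpus file
    total = len(data)
    counts = Counter(data)

    bits_per_char = -sum(c / total * math.log2(c / total) for c in counts.values())
    print(f"unigram entropy: {bits_per_char:.2f} bits/char")
    print(f"ideal unigram coder: ~{bits_per_char * total / 8 / 1024:.0f} KiB")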
> Image compression algorithms are, generally, pretty good at finding patterns in images and squashing them down.
This really isn't true in practice. You could imagine an image-compression algorithm that did this, but it would likely be too heavy for everyday use. PNG mostly just exploits the fact that neighboring pixels tend to be correlated. JPEG mostly works by applying a DCT (a close relative of the FFT) and throwing away the high-frequency components that the human eye can't detect.
If you want to play around with lossless compression, try out my "BYOM" (Bring Your Own Model) library for arithmetic encoding:
https://github.com/comperical/MirrorEncode
> If you want to play around with lossless compression, try out my "BYOM" (Bring Your Own Model) library for arithmetic encoding: https://github.com/comperical/MirrorEncode
I believe this concept is widely known in high-end compressors. (Its weakness is also widely known: decompression cannot be made much faster than compression.)
https://developer.mozilla.org/en-US/docs/Web/API/Compression...
https://bochs.info/vec2jpg/
Isn’t the loss of precision what is “compressing” the file? The Shakespeare corpus probably doesn’t use any Unicode characters.