pjc50 · 2 years ago
Some years ago (1996) I wrote a GIF decoder from scratch based entirely on http://qzx.com/pc-gpe/gif.txt (and a description of LZW bundled with my copy of pcgpe that seems to be missing from that page). I remember it being quite a struggle against off-by-one errors.

I do wish there was a really good format for describing binary file formats in a way that was amenable to codegen. Kaitai https://kaitai.io/ seems to be the state of the art.

yjftsjthsd-h · 2 years ago
Kaitai looks nice - have you used it enough to review how it handles? I'm just starting a project to deal with somewhat involved on-disk formats[0] and this might be helpful.

[0] The other day, someone was asking for a "tar2ext4" tool, and I thought "hey, that should exist, and I need a side project!". I was prepared to use an annotated hex viewer ( https://hachoir.readthedocs.io/en/latest/wx.html ) and hand roll the encoder/decoder, but I'll happily take tool assistance:)

pjc50 · 2 years ago
Kaitai is good if:

- your format is fully known (it's less helpful if you're trying to incrementally build a parser while reverse engineering)

- you want to read files, but don't care about writing

- you don't mind that the development is not very active

For writing "tar2ext4" I would genuinely look at how much work it would be to run the ext4 code from the kernel in a different context; there's a lot of it to consider. Or do what the Apple "dmg" tooling does and make a ramdisk.

mikecx · 2 years ago
Love learning about image/video formats and I enjoy how you broke it all down. Just a heads up, pretty sure you've got a typo:

"Next is the Global Packed Field, which in this case is 70 which in binary form is 00000000."

70 in binary would be 1000110. (64 + 0 + 0 + 0 + 4 + 2 + 0)

jsf01 · 2 years ago
This pops up again later when he says 81 is 10000001. Or that width 8 corresponds to 04. I don’t know enough about the gif format to know if I just misunderstood these parts or if they were written incorrectly, but it was a bit confusing.
happybits · 2 years ago
81 in hex is 10000001 in binary, so I think this part is correct
happybits · 2 years ago
Thanks mikecx, it was a typo, I've changed the text to "Next is the Global Packed Field, which in this case is 00..."
happybits · 2 years ago
CUSTOMER

Hackerman, I need an impressive icon for my website. It should be 5x5 pixels big and look like a rabbit. Can you please draw it for me?

HACKERMAN

Draw!? Bah! I don’t need any graphic program for that. I am Hackerman. I will code it for you. You will get the image next week.

CUSTOMER

Next week?? But…

HACKERMAN

No buts! I just need to read about how the GIF file format works, then I can create the image in no time.

[TIME PASSES]

After spending some evenings, Hackerman gets the main idea of how the GIF file format works and the compression algorithm called LZW. With that knowledge, he succeeded in creating the image within an hour.

Hackerman calculated that the binary of the image should be as follows:

47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 05 00 05 00 81 11 11 11 FF FF FF D5 D7 D9 00 00 00 07 0F 80 01 00 83 01 82 84 85 88 82 8A 85 02 85 81 00 3b

So he just opened his code editor, saved the file as rabbit.gif, and sent it to his customer. Boom! Easy-peasy!

Do you want to understand the GIF file format and be as cool as Hackerman?
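
For anyone who wants to repeat the stunt without a hex editor, here is a minimal Python sketch that turns the byte listing above into a file. The names `RABBIT_HEX`, `rabbit_bytes` and `write_rabbit` are made up for illustration. The bytes are copied verbatim from the listing, so whatever quirks they have (the replies mention the screen size) carry over unchanged.

```python
from pathlib import Path

# Bytes copied verbatim from the listing above (54 bytes in total).
RABBIT_HEX = """
47 49 46 38 39 61 00 00 00 00 70 00 00 2c 00 00 00 00 05 00 05 00 81
11 11 11 FF FF FF D5 D7 D9 00 00 00 07 0F 80 01 00 83 01 82 84 85 88
82 8A 85 02 85 81 00 3b
"""


def rabbit_bytes() -> bytes:
    """Return the listing as raw bytes."""
    # Strip all whitespace first so the hex parser only sees digit pairs.
    return bytes.fromhex("".join(RABBIT_HEX.split()))


def write_rabbit(path: str) -> int:
    """Write the bytes to `path` and return how many were written."""
    return Path(path).write_bytes(rabbit_bytes())
```

Calling `write_rabbit("rabbit.gif")` should leave a 54-byte file that starts with the ASCII signature `GIF89a` and ends with the trailer byte `3b`.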

atoav · 2 years ago
I teach a foundational media technology course at one of the bigger European art universities, and I do the same thing with the students using a Broadcast Wave file.

The goal of the thing isn't to turn them into hackers; it is to give them a feeling for what the stuff they work with is made of, what a file is. This is also a great introduction to talking about compression, metadata, encoding, decoding, sample rate, bit depth and so on.

If you dive that deep into it, the settings in a typical media conversion program will suddenly become much less intimidating. My motto always was: this was made by humans so it should be possible for humans to understand it as well. And this is maybe the "hidden" lesson: If you bring enough patience you can go into the depth of nearly every topic.

JohnFen · 2 years ago
I absolutely love that you do this. There appears to be a trend to discount the value of knowing the low-level details of these sorts of things -- but even if you never actually work at that level, knowing what's happening "behind the scenes" increases and deepens understanding and can make high-level things that seem arbitrary or nonsensical make sense.
philsnow · 2 years ago
I forgot most of what little I learned in semiconductor physics and analog design courses in college, but I remember enough to know that nothing about computers is magic. Nothing. That knowledge gives me confidence that I can figure out anything having to do with computers given enough quiet time.

On the other hand, I don't really know much about cars. If I had a 2024 model year car and something went wrong with it, I'd very quickly run up against a wall of arcane-seeming knowledge that I don't have easy access to. I don't know where I would start to learn everything about cars from scratch like I did with computers.

There might be a name for this, but I don't know what it is.

happybits · 2 years ago
atoav: this sounds like a great exercise. Have you written any public article about that?

The main reasons for writing this article were:

* My curiosity: what secrets are behind all the strange bytes in a graphics file?
* To spend some time doing something weird that I know will have zero effect on my career and no chance of giving any benefit or profit in the future.

jolmg · 2 years ago
That first

> 00 00 00 00

should be `05 00 05 00`.

Brentward · 2 years ago
Yeah the article says these "aren't used anymore," but my image viewer (sxiv) complained that the gif was corrupted until I put something there. It still displayed the image, but it also complained afterwards.
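
A rough Python sketch of the fix being discussed, assuming the usual GIF layout in which the logical screen width and height are two little-endian 16-bit integers at byte offsets 6 to 9, right after the 6-byte signature. `HEADER`, `screen_size` and `with_screen_size` are hypothetical names, and only the first 13 bytes of the quoted file are used.

```python
import struct

# First 13 bytes of the quoted file: "GIF89a" plus the logical screen descriptor.
HEADER = bytes.fromhex("47494638396100000000700000")


def screen_size(gif: bytes):
    """Return (width, height) from the logical screen descriptor."""
    # "<HH" = two little-endian unsigned 16-bit integers, read at offset 6.
    return struct.unpack_from("<HH", gif, 6)


def with_screen_size(gif: bytes, width: int, height: int) -> bytes:
    """Return a copy of `gif` with bytes 6..9 replaced by the given size."""
    return gif[:6] + struct.pack("<HH", width, height) + gif[10:]


patched = with_screen_size(HEADER, 5, 5)
```

With the bytes as quoted, `screen_size(HEADER)` gives a 0x0 screen; after patching, bytes 6 to 9 read `05 00 05 00`, which is the value suggested above.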
bluejekyll · 2 years ago
Yes? Is that such a bad thing? Is this trying to say there is no value in learning something so low level? Exploration leads to learning, and learning leads to innovation. Perhaps Hackerman will go on to create the greatest image encoding library for images so that they can easily scale from 5x5 to 25x25 or more and fit in the same space as the 5x5. Who knows.
loloquwowndueo · 2 years ago
He’s quoting from TFA.
jszymborski · 2 years ago
Defeats the purpose a bit, but it's far easier to type out a PBM [0] file by hand and then just convert it to GIF using e.g. ImageMagick

[0] https://en.wikipedia.org/wiki/Netpbm
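
To show how little there is to type, here is a small Python sketch that emits a plain-text (P1) PBM. The 5x5 pattern in `ROWS` and the function name `to_pbm` are invented for illustration and are not taken from the article. In PBM, `1` means black and `0` means white.

```python
# A made-up 5x5 pattern; each character is one pixel (1 = black, 0 = white).
ROWS = [
    "10001",
    "10001",
    "11111",
    "10101",
    "11011",
]


def to_pbm(rows) -> str:
    """Render rows of '0'/'1' characters as a plain (P1) PBM document."""
    height = len(rows)
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    if any(ch not in "01" for row in rows for ch in row):
        raise ValueError("pixels must be '0' or '1'")
    # Header: magic number, then width and height; pixels follow row by row.
    body = "\n".join(" ".join(row) for row in rows)
    return f"P1\n{width} {height}\n{body}\n"
```

Saving the result of `to_pbm(ROWS)` as `rabbit.pbm` and running something along the lines of `convert rabbit.pbm rabbit.gif` with ImageMagick should then do the conversion.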

fwip · 2 years ago
There's a cute plugin[0] for Vim which converts any image to XPM, which is a similar format that Vim has syntax-coloring for. You can edit the text, and then on save, it will get converted back to the original format. I've used it a few times to quickly preview an image or edit a favicon. It's more of a party trick than seriously useful, though.

[0] https://github.com/tpope/vim-afterimage

yjftsjthsd-h · 2 years ago
Not even just manually typing... Last time I wanted to have a program save a picture [0] it was easiest to write PPM and then convert that to a real format. Super inefficient file sizes, but a good tradeoff for a hobby project. I can take some big intermediate files in exchange for not needing a graphics file format library :)

[0] I was playing with the Linux framebuffer and wrote - among other things - a screenshot tool.
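
A sketch of what "write PPM from a program" can look like in Python, assuming the binary (P6) flavour with 8 bits per channel. The function name `to_ppm` and its arguments are made up here; this is not the screenshot tool mentioned above.

```python
def to_ppm(pixels, width: int, height: int) -> bytes:
    """Encode row-major (r, g, b) tuples as a binary (P6) PPM."""
    # Header: magic number, dimensions, maximum channel value.
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    body = bytearray()
    for r, g, b in pixels:
        # bytes() rejects values outside 0..255, which doubles as validation.
        body += bytes((r, g, b))
    if len(body) != width * height * 3:
        raise ValueError("pixel count does not match width * height")
    return header + bytes(body)
```

The whole "library" is one header line plus three raw bytes per pixel, which is why the files are large but the code is tiny.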

TacticalCoder · 2 years ago
Wait, TFA doesn't even contain a link to this vid from Hackerman hacking time!?

https://youtu.be/KEkrWRHCDQU

(my favorite part is when he goes into hardcore hacking mode while putting a Nintendo glove on)

kristopolous · 2 years ago
What was the follow up content that's no longer available?
Moru · 2 years ago
Eh, yes it did :-)
jtaft · 2 years ago
Reminds me of graphic designer vs css programmer

https://youtube.com/shorts/YWT8Dqd-AmQ?feature=shared

mock-possum · 2 years ago
> Next is the Global Packed Field, which in this case is 70 which in binary form is 00000000

… what? 70 should be 1000110 surely?

charlieyu1 · 2 years ago
70 in hex. So 01110000. The 111 bits are unused according to the article so it can be anything.
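
The hex-versus-decimal mix-up is easy to check in Python. The sketch below also splits the two packed bytes from the file into fields, using the bit layout as I understand it from the GIF89a specification (this may differ from how the article describes the bits). The function names are hypothetical.

```python
def screen_packed_fields(value: int) -> dict:
    """Split the packed byte of the logical screen descriptor."""
    return {
        "global_color_table": bool(value & 0x80),            # bit 7
        "color_resolution_bits": ((value >> 4) & 0x07) + 1,  # bits 6-4
        "sorted": bool(value & 0x08),                        # bit 3
        # Bits 2-0; only meaningful when the table flag is set.
        "global_color_table_entries": 2 ** ((value & 0x07) + 1),
    }


def image_packed_fields(value: int) -> dict:
    """Split the packed byte of an image descriptor."""
    return {
        "local_color_table": bool(value & 0x80),             # bit 7
        "interlaced": bool(value & 0x40),                    # bit 6
        "sorted": bool(value & 0x20),                        # bit 5
        # Bits 2-0; bits 4-3 are reserved.
        "local_color_table_entries": 2 ** ((value & 0x07) + 1),
    }


hex_70 = format(0x70, "08b")   # the byte as it appears in the file
dec_70 = format(70, "b")       # decimal seventy, for comparison
hex_81 = format(0x81, "08b")
```

Here `hex_70` is `01110000`, `dec_70` is `1000110` and `hex_81` is `10000001`. Under this layout, `0x81` means a local color table with 4 entries, which matches the 12 color bytes that follow it in the file.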
follower · 2 years ago
One great resource for GIF-related explorations is Matthew Flickinger's "What's In A GIF" project:

* https://www.matthewflickinger.com/lab/whatsinagif/index.html

The original version is apparently from ~2005 and is used as the basis of the giflib docs referenced by the original article[0]. (The giflib docs do expand on the content of the original, so are still worth reading.)

But Matthew Flickinger's original version has continued to be updated as recently as 2022[1] and now includes two helpful browser-based GIF tools:

* GIF Explorer: https://www.matthewflickinger.com/lab/whatsinagif/gif_explor...

* GIF Encoder: https://www.matthewflickinger.com/lab/whatsinagif/gif_encode...

GIF Explorer displays the "interpreted" bytes of any GIF file in an almost "literate" style and has a UI/UX which I'd be really interested to see used in a generic reverse-engineering/binary viewer tool.

GIF Encoder enables you to create an image in the browser & see how it is GIF encoded.

I have a rant about how modern GIF usage could be so much better than it is (and still be within the original specification) but instead of subjecting you to that I'll subject you to this project of mine instead: https://audiogif.rancidbacon.com

[0] https://giflib.sourceforge.net/whatsinagif/index.html

[1] https://github.com/MrFlick/whats-in-a-gif

akdas · 2 years ago
I absolutely love the "What's In A GIF" series. It's what inspired me to write my own GIF decoder while learning Erlang at the same time: https://github.com/avik-das/giferly

The first time around, I struggled a lot with decoding errors. Many years later, after becoming a more experienced developer, I wrote the LZW decompression with unit tests. Doing so forced me to think about each edge case, and fix issues without breaking existing functionality. Very quickly, I was able to open pretty much any GIF file I threw at it!
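
For readers curious what such a decoder looks like, here is a compact Python sketch of GIF-style LZW decompression. It is not taken from giferly; `lzw_decode` and `RABBIT_DATA` are names invented here, and the sample input is the 15-byte image data block from the rabbit file quoted elsewhere in this thread (minimum code size 7). It assumes the sub-block length bytes have already been stripped.

```python
def lzw_decode(min_code_size: int, data: bytes) -> list:
    """Decode GIF-flavoured LZW data into a list of color indices."""
    clear_code = 1 << min_code_size
    end_code = clear_code + 1

    def fresh_table():
        # One entry per color index, plus two placeholders so that list
        # positions line up with the clear and end codes.
        return [(i,) for i in range(clear_code)] + [(), ()]

    table = fresh_table()
    code_size = min_code_size + 1
    prev = None
    out = []
    bit_buffer = 0
    bit_count = 0

    for byte in data:
        # GIF packs codes least-significant bit first.
        bit_buffer |= byte << bit_count
        bit_count += 8
        while bit_count >= code_size:
            code = bit_buffer & ((1 << code_size) - 1)
            bit_buffer >>= code_size
            bit_count -= code_size

            if code == clear_code:
                table = fresh_table()
                code_size = min_code_size + 1
                prev = None
                continue
            if code == end_code:
                return out

            if code < len(table):
                entry = table[code]
            elif code == len(table) and prev is not None:
                # The code is not in the table yet: it must be the previous
                # sequence followed by its own first index.
                entry = prev + prev[:1]
            else:
                raise ValueError(f"unexpected code {code}")

            if prev is not None and len(table) < 4096:
                table.append(prev + entry[:1])
                # Widen the codes once the table outgrows the current size.
                if len(table) == (1 << code_size) and code_size < 12:
                    code_size += 1

            out.extend(entry)
            prev = entry
    return out


RABBIT_DATA = bytes.fromhex("800100830182848588828A85028581")
pixels = lzw_decode(7, RABBIT_DATA)
```

The "code not in the table yet" branch is the classic edge case, and it is exactly the kind of thing a unit test pins down. For the sample data, `pixels` holds 25 indices, i.e. five rows of five.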

happybits · 2 years ago
Thanks follower

I've read his posts about GIF and referred to them at the end of my article. But I didn't know about "GIF Encoder" and "GIF Explorer" - interesting!