You mention "another packing optimization". I'm wondering, how are you transferring frames? The dot matrix is eight 7x5 characters, i.e. 280 bits in total, which amounts to 40 7-bit groups per frame. You seem to be using twice that space in transmission, is it wasted on some control data or is the transmission just slightly suboptimal?
The dot matrix is actually eight 5x8 characters, or 320 bits in total. I'm packing those 320 bits into the the 4 bits per byte that are available to us in this shell protocol. Plus, another 9 bytes for the packet header and footer. Looks like I wrote 92 in the article, I must have miscalculated that.
I'm not using the full 7 bits because figuring out a way to do so turned out to be way too hard for me, so I opted for a solution that is negligibly worse than the optimal one, in comparison to the original one.
If you're wondering about the exact algorithm, consider checking these files out, but please keep in mind that I haven't cleaned the code up yet: https://github.com/portasynthinca3/swl01u/blob/master/fun/bi..., https://github.com/portasynthinca3/swl01u/blob/master/fun/ba...
Deleted Comment
I really like this comparison!