When QR codes first came out I thought they were really cool. But then, re-entering meatspace after the pandemic, I was honestly saddened to see so many in-person venues start using QR codes. QR codes are machine-readable, but they sure aren't human-readable. Why can't we have both? For instance, plain text in a low-pixel font with a dotted line underneath for error correction and alignment.
OCR-A[0] looks cool, but my post above isn't saying I want menus in a machine-readable font; rather, I want the human-relevant parts of the lookup to be both human-readable and machine-readable.
Generally speaking, I'm with you. However, there is one exceptional use case: when you're with a large group where every sub-party will be ordering & paying separately. It can be a godsend to have phone ordering when 25-50 people descend on a restaurant all at once (my typical use case being kids' sports teams + family members). It's absolutely not ideal for experiential dining, where you're going for the ambience as much as the cuisine, but it definitely expedites the ordering process, and the ability to keep a tab open is a huge benefit.
Not a bad way to make a point locally, but wow are QR codes nice when you’re traveling and don’t speak the language. You get the menu, in a browser, with all of the translation and parsing tools on your phone.
Agreed. I like how most 1D barcodes have human-readable numbers/text printed under the barcode. For example, think of UPC barcodes on retail products. Not many 2D barcodes respect this convention.
This is directly caused by UPC codes being short and numerical, while 2D barcodes have significantly higher data density, often spanning the full ASCII space, where human readability doesn't bring much of an advantage.
Whenever people suggest adding a QR code somewhere (an ad, in-app, etc.), I always advocate showing the short URL too. But about half the time they insist that "everyone knows how to scan a QR code". They clearly haven't asked a few people to scan one and watched how easily most of them actually manage it.
I've seen QR codes to join a discord that have text under them that looks like
discord.gg/{... a few random characters ...}
which are just fine to scan or type in.
That's basically converting a QR scanner into a text detector. It might work, but why do they need to be human-readable? Most of the encoded string would be a UUID that's useless to human eyes anyway. After scanning, the important info usually shows up on the phone screen.
I want it human-readable so I'm not presented with QR codes when I'm in meatspace. I'd rather see a pixel font that says MENU://JOES-CHICKEN when I want to look up the menu at my local chicken restaurant than look at a QR code.
100%, I remember this being a big pain during the period where places were open but you had to order from the table. If your phone didn't want to scan the code you were kind of stuck - and to make it worse some of them _deliberately_ degraded the code to add a cutesy logo or whatever.
Thanks, this looks like an awesome tool. I wish more web explainers took this "explain your work" approach. It's great that it uses the content you put in to illustrate the step-by-step breakdown.
It works, though. Perhaps it would have been easier to use had it been optimized, but that's a nontrivial effort - especially if the original requirements were bloated. To me it's no different from misusing heavy JS frameworks or shipping an Electron app.
A good channel overall. Some things are better, some others worse, but his explanation of Bayes' theorem, for instance, is on par with the one from 3blue1brown.
tl;dr: It is sadly not the most efficient encoding (and they missed an opportunity to make it actually base41, which could have been URL safe) -- as defined it only needs 41 characters (as 41^3 > 2^16).
The RFC is also not standards-track; it's just "Category: Informational".
I think a better approach is to recognize that there are many circumstances where different sets of characters make sense for encoding data. There's no need to write an RFC; instead, define a custom alphabet for them using something like base-x[1].
[1]: https://github.com/cryptocoinjs/base-x
If your original data is not a byte sequence then it would indeed work. Otherwise you have to convert it back to bytes yourself, but no small x exists such that 10^x is just barely larger than 256^y, so bignum arithmetic would be necessary for efficient encoding. Base45 doesn't need bignum and only incurs ~3.2% overhead [1] compared to the pure octet mode (which might be unsuitable for decoder compatibility and other purposes, though).
[1] 32 original bits = 4 original bytes = 6 base45-encoded letters = 33 bits in the alphanumeric mode, so the overhead is 33/32 - 1 = 0.03125 for 4n bytes of data.
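To make the chunking concrete, here's a minimal encoder sketch following the published scheme (RFC 9285 character set, two bytes to three characters; decoding and error handling omitted):

    BASE45_CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

    def base45_encode(data: bytes) -> str:
        # Each 2-byte chunk becomes 3 characters, least significant digit
        # first; a trailing odd byte becomes 2 characters. No bignum needed,
        # since 45^3 = 91125 > 65536 = 2^16 covers every 16-bit value.
        out = []
        for i in range(0, len(data) - 1, 2):
            n = data[i] * 256 + data[i + 1]
            out += [BASE45_CHARSET[n % 45],
                    BASE45_CHARSET[(n // 45) % 45],
                    BASE45_CHARSET[n // (45 * 45)]]
        if len(data) % 2:
            out += [BASE45_CHARSET[data[-1] % 45],
                    BASE45_CHARSET[data[-1] // 45]]
        return "".join(out)

    print(base45_encode(b"AB"))  # -> BB8, matching the RFC's own example

The same loop with 45 swapped for 41 (and a 41-character alphabet) would give the base41 variant mentioned above, since 41^3 = 68921 still exceeds 2^16.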
One of the things that Data Matrix got right was being able to shift between encoding regimes mid-stream. Many character sets can be represented in radix-40 (so three characters per two bytes), and the occasional capital character can be handled by a shift byte. If you have a long string of digits, they can be encoded at 4 bits per digit. You can even put raw binary in there if need be.
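For concreteness, a rough sketch of that radix-40 packing (Data Matrix's C40 mode), with shift sequences and the padding rules left out:

    # C40 basic set: values 0-2 are shift codes, 3 is space,
    # 4-13 are the digits, 14-39 are A-Z.
    def c40_value(ch: str) -> int:
        if ch == " ":
            return 3
        if ch.isdigit():
            return ord(ch) - ord("0") + 4
        if "A" <= ch <= "Z":
            return ord(ch) - ord("A") + 14
        raise ValueError(f"{ch!r} needs a shift sequence (omitted here)")

    def c40_pack(s: str) -> bytes:
        # Three radix-40 values pack into one 16-bit word:
        # V = v1*40^2 + v2*40 + v3 + 1, stored big-endian.
        assert len(s) % 3 == 0, "padding rules omitted in this sketch"
        out = bytearray()
        for i in range(0, len(s), 3):
            v = (c40_value(s[i]) * 1600
                 + c40_value(s[i + 1]) * 40
                 + c40_value(s[i + 2]) + 1)
            out += bytes([v >> 8, v & 0xFF])
        return bytes(out)

    print(c40_pack("AIM").hex())  # three letters -> two bytes: 5b0b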
A QR Code consists of a sequence of segments. Each segment has a mode - numeric, alphanumeric, kanji, or byte. It is possible to shift between encoding regimes by ending a segment and beginning a new segment with a different mode. https://www.nayuki.io/page/optimal-text-segmentation-for-qr-...
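A sketch of the bit arithmetic that makes segmentation worthwhile; the count-field widths below are the ones for versions 1-9 (larger versions use wider count fields), and the example string is mine:

    # Mode indicator is 4 bits; the character-count field width depends
    # on the mode (values below are for QR versions 1-9).
    COUNT_BITS = {"numeric": 10, "alphanumeric": 9, "byte": 8}

    def segment_bits(mode: str, n_chars: int) -> int:
        header = 4 + COUNT_BITS[mode]
        if mode == "numeric":         # 3 digits -> 10 bits; leftovers cost 4 or 7
            data = 10 * (n_chars // 3) + (0, 4, 7)[n_chars % 3]
        elif mode == "alphanumeric":  # 2 chars -> 11 bits; a leftover costs 6
            data = 11 * (n_chars // 2) + 6 * (n_chars % 2)
        else:                         # byte mode: 8 bits per character
            data = 8 * n_chars
        return header + data

    # "HTTPS://EXAMPLE.COM/1234567890": one alphanumeric segment, or an
    # alphanumeric segment plus a numeric segment for the 10 digits?
    print(segment_bits("alphanumeric", 30))                                # 178
    print(segment_bits("alphanumeric", 20) + segment_bits("numeric", 10))  # 171

Even on a string this short, paying for a second segment header saves a few bits, which is the tradeoff the optimal-segmentation page works through.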
It seems to me the best approach would be to compress the contents with a Huffman code or some other entropy coding. All this business of restricted character sets is just an ad-hoc way of reducing the size of each symbol, and we've got much more mature solutions for that.
For entropy codes to be effective for such short strings you need a shared initial probability table. And if you have that you are effectively back at special encoding modes for each character set.
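To make that concrete, a toy static-Huffman sketch: the frequency counts below are invented, and both the encoder and the decoder would need this exact table baked in ahead of time, which is what makes it "effectively a special encoding mode":

    import heapq
    from itertools import count

    def huffman_lengths(freqs):
        # Standard Huffman construction; returns code length per symbol.
        tie = count()  # tiebreaker so the heap never compares dicts
        heap = [(f, next(tie), {sym: 0}) for sym, f in freqs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            fa, _, a = heapq.heappop(heap)
            fb, _, b = heapq.heappop(heap)
            heapq.heappush(heap, (fa + fb, next(tie),
                                  {s: d + 1 for s, d in {**a, **b}.items()}))
        return heap[0][2]

    # Invented character frequencies standing in for a shared URL model:
    SHARED_TABLE = {"/": 40, ":": 12, ".": 30, "h": 25, "t": 50, "p": 25,
                    "s": 20, "e": 35, "x": 6, "a": 30, "m": 25, "l": 20,
                    "c": 30, "o": 40}
    lengths = huffman_lengths(SHARED_TABLE)
    url = "http://example.com/"
    print(sum(lengths[ch] for ch in url), "bits, vs", 8 * len(url), "in byte mode")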
Another level of "Why": QR codes were invented in 1994 in Japan for automated scanning. As you all probably know, Japanese uses Kanji, and hiragana+katakana; it does not rely on the Latin alphabet. However, Japanese speakers are familiar with it and occasionally use it for specific purposes. While katakana is commonly used for transliterating foreign words, sometimes the exact original spelling is preserved for stylistic reasons, especially in acronyms or single-word cases.
However, in such cases they usually use only capital letters. Japanese script has no distinction between lowercase and uppercase letters, so for Japanese speakers that distinction is, if you allow some leeway, similar to the difference between normal, italic, or bold letters in English. With this context, it makes sense, if you are making a "group of letters and numbers" mode, to default to uppercase plus numbers as the normal/shorter version.
[0]: https://en.wikipedia.org/wiki/OCR-A
And __not__ this one: https://wristcam.com
My own 'discovery' about QR codes a few years ago is that you can make them "version 2" sized, which ought to be easy to read with a low-spec system, and still have astronomical capacity if you use uppercase characters, a reasonably short domain, and identifiers similar to random UUIDs. These were part of the system of "three-sided cards"
https://mastodon.social/@UP8/111013706271196029
but new-style cards put the QR code in front because (1) I have a huge amount of glossy paper that I can't print on the back of, (2) you can't read a QR code on the back if the card is stuck to the wall with mounting putty, and (3) three-sided cards struggled with branding, in that people didn't really understand the affordances they offered, a problem the new-style cards attack in various ways.
https://mastodon.social/@UP8/113541119391897096
(Note the QR codes on both of those cards do not point at safebooru but at a redirect that I control that fits my QRL specification.)
Personally I don't think any QR code for the web should ever require more than a "version 2" code, and printing a QR code which requires extra alignment markers is a sign of failure. (E.g. sure, you can make a QR code with 2000 bytes of form data embedded in it, but should you? Random UUIDs are so numerous and redirects so cheap that every new-style card, like that Yakumo Ran card, has a unique id; with inkjet printing it doesn't cost anything more.)
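Rough arithmetic for why version 2 is enough, using the published capacity table (version 2 at error correction level L holds 47 alphanumeric characters) and a hypothetical short redirect domain:

    DOMAIN = "HTTPS://EXAMPLE.COM/"   # hypothetical; all QR-alphanumeric chars
    ID_CHARS = (128 // 8 // 2) * 3    # a 128-bit id -> 16 bytes -> 24 base45 chars
    TOTAL = len(DOMAIN) + ID_CHARS
    print(TOTAL, "chars;", "fits" if TOTAL <= 47 else "exceeds", "version 2-L (47)")

44 characters, so a 128-bit random id behind a short uppercase domain squeezes into version 2 with room to spare.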
https://qr.blinry.org/
> For our code, encoding mode is Alphanumeric (2), but we haven't implemented how to read that yet. Sorry! Try another QR code!
https://www.nayuki.io/page/creating-a-qr-code-step-by-step
It reproduces what the article is saying, with more detail.
https://nick.zoic.org/art/qr-codes-advice/#no-funny-business
The original "test blanks" for UPCs were steel plates with machined cuts in them, photoreduced down to small sizes for testing.
When I do the math, alphanumeric is the most efficient QR mode, although just barely.
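One way to do that math (packing efficiency of the two charset modes; byte mode is trivially 100% and the kanji mode is left out):

    from math import log2

    # Fraction of each mode's bit budget that is real information:
    numeric_eff = log2(10) * 3 / 10    # 3 digits packed into 10 bits -> ~99.66%
    alnum_eff = log2(45) * 2 / 11      # 2 chars packed into 11 bits  -> ~99.85%
    print(f"numeric {numeric_eff:.4%}  alphanumeric {alnum_eff:.4%}")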
Some 1D barcodes do have inline shift symbols like the ones you describe for Data Matrix, though. E.g. https://en.wikipedia.org/wiki/Code_128