-
While reading the chapter on codes in "If It's Smart, It's Vulnerable" by @mikko I started thinking about this again. Would the whole chapter fit inside a QR code in ALPHANUMERIC mode? This became quite a rabbit hole... @gnyman/1561821886316773376
-
Turns out that, no. The chapter is ~4700 characters. So ~400 too many. AND ALPHANUMERIC IS/NT VERY READABLE. NO NEWLINES AND ONLY UPPERCASE AND $%*+-./: ALLOWED
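
To make the limit concrete, here is a minimal Python sketch. It assumes the largest QR code (version 40, error correction level L), which holds 4296 characters in alphanumeric mode. The function names are made up for illustration.

```python
# Characters allowed in QR alphanumeric mode: digits, uppercase letters,
# space and the symbols $%*+-./:
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")

# Assumption: biggest symbol (version 40) at the lowest error correction level (L).
MAX_ALPHANUMERIC_CHARS = 4296


def unsupported_chars(text: str) -> set[str]:
    """Return the characters in text that alphanumeric mode cannot encode."""
    return set(text) - QR_ALPHANUMERIC


def fits_in_one_qr(text: str) -> bool:
    """True if text is pure alphanumeric and short enough for one QR code."""
    return not unsupported_chars(text) and len(text) <= MAX_ALPHANUMERIC_CHARS


if __name__ == "__main__":
    # Lowercase letters, the apostrophe, the question mark and the newline are rejected.
    print(unsupported_chars("Isn't it?\n"))
    # A 4700-character chapter is too long even if every character is allowed.
    print(fits_in_one_qr("A" * 4700))
```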
-
But wait. What about compression? Yes! Even DEFLATE can do it, and there is a Base45 encoding specifically for ALPHANUMERIC QR codes. Now the whole chapter fits in ~3500 chars (or 3400 with bzip2). (🎩 to gchq.github.io/)
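
A rough sketch of the compress-then-encode step, in Python. The encoder follows Base45 as described in RFC 9285 (every 2 bytes become 3 characters, a trailing byte becomes 2). It uses `zlib`, which wraps DEFLATE in a few extra header and checksum bytes, so the numbers will differ slightly from raw DEFLATE. `chars_after_compression` is a hypothetical helper name.

```python
import zlib

# Base45 alphabet: exactly the 45 characters of QR alphanumeric mode.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def base45_encode(data: bytes) -> str:
    """Encode bytes as Base45: every 2 bytes become 3 characters."""
    out = []
    for i in range(0, len(data) - 1, 2):
        n = data[i] * 256 + data[i + 1]
        n, c = divmod(n, 45)   # c: least significant base-45 digit
        e, d = divmod(n, 45)   # d: middle digit, e: most significant digit
        out += [ALPHABET[c], ALPHABET[d], ALPHABET[e]]
    if len(data) % 2:
        # A trailing single byte becomes 2 characters.
        d, c = divmod(data[-1], 45)
        out += [ALPHABET[c], ALPHABET[d]]
    return "".join(out)


def chars_after_compression(text: str) -> int:
    """Number of alphanumeric characters needed for the compressed text."""
    compressed = zlib.compress(text.encode("utf-8"), 9)
    return len(base45_encode(compressed))
```

The result of `chars_after_compression(chapter_text)` is the number to compare against the per-code character limit.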
-
[[ Edit: The QR code maximum is actually not 23648 bytes, it's bits. So ~3 KiB, and the whole book obviously won't fit. But if I had known that I wouldn't have continued down the rabbit hole, so let's just continue seeing how much compressed text we can fit into a QR code ]]
-
No... The full text compresses to roughly 111 KiB with bzip2, which is ~38x too much (not the ~5x I first calculated). Hmm, I wonder what more modern algorithms can do? Let's try zstd and brotli. No... they actually turned out bigger! (123 and 129 KiB). Is there anything else out there?
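
If you want to run this kind of comparison yourself, here is a small Python sketch. It only uses compressors from the standard library (zlib, bzip2, xz), so it is not the exact set tried above: zstd and brotli need third-party packages and are left out. `compressed_sizes` is a made-up helper name.

```python
import bz2
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Compressed size in bytes for each stdlib codec at its strongest setting."""
    return {
        "zlib (DEFLATE)": len(zlib.compress(data, 9)),
        "bzip2": len(bz2.compress(data, 9)),
        "xz (LZMA)": len(lzma.compress(data, preset=9 | lzma.PRESET_EXTREME)),
    }
```

Feed it the raw bytes of your own text file and sort the result by size. Which codec wins depends a lot on the input.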
-
Yes! Turns out there are at least two long-running competitions for compressing pure text as much as possible, with little regard for speed or resource usage. First: mattmahoney.net/dc/text.html by @mattmahoneyfl. Second: prize.hutter1.net/ by @mhutter42
-
So let's try the second-best one, cmix. Ok, wow... that was slow: it took 10 minutes (vs <1s for bzip2 --best) and gave us 88 KiB! That's nice but not enough. We'd need ~30 QR codes (with the correct ~2.9 KiB per QR), which is actually not bad.
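
The back-of-the-envelope math, as a Python sketch. It takes the 23648-bit maximum from the edit above at face value and ignores per-code overhead such as mode and length headers, so treat the results as a lower bound. `qr_codes_needed` is a hypothetical helper.

```python
import math

QR_MAX_BITS = 23648               # largest QR code, figure from the edit above
QR_MAX_BYTES = QR_MAX_BITS // 8   # 2956 bytes, about 2.9 KiB


def qr_codes_needed(payload_bytes: int) -> int:
    """How many maximum-size QR codes a payload needs, rounded up."""
    return math.ceil(payload_bytes / QR_MAX_BYTES)


if __name__ == "__main__":
    for label, kib in [("cmix", 88), ("bzip2", 111)]:
        print(label, qr_codes_needed(kib * 1024))
```

Rounding up gives 31 codes for the 88 KiB cmix output and 39 for the 111 KiB bzip2 output.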
-
But if I wanted to print it as a code, I bet there are better ways? Yes! @martinmonperrus has written a great overview here: monperrus.net/martin/store-data-paper I could use OPTAR ronja.twibright.com/optar/ or JABCode
-
OPTAR can apparently store ~200 KiB of data per A4 page, so the whole compressed "If It's Smart, It's Vulnerable" (~120 to 88 KiB) would fit just fine on one page. JABCode I'm not sure about: it seems to hold 4.6 KiB per "symbol" (square), but you can have more than one symbol.
-
Either way, JAB Code seems interesting. Seems it was developed by @FraunhoferSIT and is nowadays an ISO standard, 23634:2022. If you want to read the details without paying, the BSI doesn't paywall their standards: bsi.bund.de/SharedDocs/Downloads/EN/BSI/Publications/TechGuidelines/TR03137/BSI-TR-03137_Part2.pdf [FIN]