gnyman's Twitter Archive, № 1,422

  1. While reading the chapter on codes in "If It's Smart, It's Vulnerable" by @mikko I started thinking about this again. Would the whole chapter fit inside a QR code with ALPHANUMERIC? This became quite a rabbit hole... @gnyman/1561821886316773376
    1. …in reply to @gnyman
      Turns out that, no. The chapter is ~4700 characters. So ~400 too many. AND ALPHANUMERIC IS/NT VERY READABLE. NO NEWLINES AND ONLY UPPERCASE AND $%*+-./: ALLOWED
      1. …in reply to @gnyman
        But wait. What about compression? Yes! Even DEFLATE can do it, and there is a BASE45 encoding specifically for ALPHANUMERIC QR codes. Now the whole chapter fits in ~3500 chars (or 3400 with bzip2). (🎩 to gchq.github.io/)
        1. …in reply to @gnyman
          And actually... the BASE45 is unnecessary. We can store binary directly in QR. A whopping 23648 bytes!? (~23 KiB) Wow, I wonder how much the whole book compresses. Can I fit it all in it?
          1. …in reply to @gnyman
            [[ Edit: It's actually not 23648 bytes, it's bits. So ~3 KiB, and it obviously won't fit. But if I had known that, I wouldn't have continued down the rabbit hole, so let's just continue seeing how much compressed text we can fit into a QR code ]]
            1. …in reply to @gnyman
              No... The full text compresses to roughly 111 KiB with bzip2, which is ~̶𝟻̶𝚡̶ too much. Hmm, I wonder what more modern algorithms can do? Let's try zstd and brotli. No... they actually turned out bigger! (123 and 129 KiB). Is there anything else out there?
              1. …in reply to @gnyman
                Yes! Turns out there are at least two long-running competitions for compressing pure text as much as possible with little regard for speed or resource usage. First, mattmahoney.net/dc/text.html by @mattmahoneyfl. Second, prize.hutter1.net/ by @mhutter42.
                1. …in reply to @gnyman
                  So let's try the second-best one, cmix. OK, wow... that was slow: it took 10 minutes (vs <1 s for bzip2 --best) and gave us 88 KiB! That's nice but not enough. We'd need ~30 QR codes (with the correct ~2.9 KiB per QR), which is actually not bad.
                  1. …in reply to @gnyman
                    But if I wanted to print it as a code, I bet there are better ways? Yes! @martinmonperrus has written a great overview here: monperrus.net/martin/store-data-paper. I could use OPTAR (ronja.twibright.com/optar/) or JABCode.
                    1. …in reply to @gnyman
                      OPTAR can apparently store ~200 KiB of data per A4 page, so the whole compressed "If It's Smart, It's Vulnerable" (~120 to 88 KiB) would fit just fine on one page. For JABCode I'm not sure; it seems to be 4.6 KiB per "symbol" (square), but you can have more than one symbol.
                      1. …in reply to @gnyman
                        Either way, JAB Code seems interesting. It seems it was developed by @FraunhoferSIT and is nowadays an ISO standard, 23634:2022. If you want to read the details without paying, the BSI doesn't paywall their standards: bsi.bund.de/SharedDocs/Downloads/EN/BSI/Publications/TechGuidelines/TR03137/BSI-TR-03137_Part2.pdf [FIN]
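
The capacity arithmetic in the thread can be reproduced with a small Python sketch. It is not the tooling used in the thread (that was CyberChef plus command-line compressors); it only uses compressors from the Python standard library, so zstd, brotli and cmix are not covered. The capacity constants are the published limits for the largest QR code (version 40, error correction level L); the function names and the sample text are made up for illustration.

```python
import bz2
import lzma
import math
import zlib

# Limits of the largest QR code (version 40, error correction level L).
QR_DATA_BITS = 23648          # raw data bits, the number quoted in the thread
QR_BYTE_MAX_BYTES = 2953      # usable payload in byte (binary) mode
QR_ALNUM_MAX_CHARS = 4296     # usable payload in alphanumeric mode
QR_ALNUM_CHARSET = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def fits_alphanumeric(text: str) -> bool:
    """True if text uses only the 45 allowed characters and is short enough."""
    return len(text) <= QR_ALNUM_MAX_CHARS and set(text) <= QR_ALNUM_CHARSET


def qr_codes_needed(n_bytes: int) -> int:
    """Number of maximum-size byte-mode QR codes needed for n_bytes."""
    return math.ceil(n_bytes / QR_BYTE_MAX_BYTES)


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Compressed size in bytes for a few stdlib compressors at their highest level."""
    return {
        "deflate": len(zlib.compress(data, 9)),
        "bzip2": len(bz2.compress(data, 9)),
        "xz": len(lzma.compress(data, preset=9)),
    }


if __name__ == "__main__":
    # The bits-vs-bytes mix-up: 23648 bits is only about 2.9 KiB.
    print(QR_DATA_BITS // 8, "bytes per QR code before mode overhead")

    # Placeholder text; substitute the real chapter or book here.
    sample = ("The quick brown fox jumps over the lazy dog. " * 200).encode()
    for name, size in compressed_sizes(sample).items():
        print(f"{name}: {size} bytes -> {qr_codes_needed(size)} QR code(s)")

    # The 88 KiB cmix result from the thread.
    print(qr_codes_needed(88 * 1024), "QR codes for 88 KiB")
```

With these constants, 88 KiB works out to 31 maximum-size QR codes, in line with the ~30 estimated in the thread. A mixed-case text with newlines fails `fits_alphanumeric` regardless of length, which is why the thread turns to Base45 or byte mode.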