Wiggles, part 15


Just a little ongoing story to give you something to play with until the next blog post.

G COKN RVAVFNVZ BIKVSY GP VPXSGKL, FPZ BJDVZ GP F SGNNSV QSJKVR. NLV XOI HFK JUDGJOKSI NRIGPX NJ ZVQGZV HLVNLVR NJ UV QJPYOKVZ JR FPXRI, FPZ LV SVFPVZ GP F UGN NJ YJQOK UVNNVR JP BV. HLVP LV RVFSGEVZ NLFN LV HFK KNFRGPX GPNJ BI QLVKN, LV NJJT F KNVA UFQT NJ NRI NJ AVVR OA FN BI YFQV. UVQFOKV G HFK NJHVRGPX FSBJKN F NLGRZ JY F BVNVR JDVR LGB (12”), LV LFZ NJ NFTV F KVQJPZ KNVA UFQT. G NJJT F KNVA YJRHFRZ, FPZ NLGK NORPVZ GPNJ F TGPZ JY YOPPI SGNNSV ZFPQV, OPNGS G XJN LGB NJ NLV ZJJR. NLFN HFK JAVP NJ SVN JON KJBV JY NLV RVVT, FPZ UI NLV NGBV BI ZFPQV AFRNPVR RVFSGEVZ LV HFKP’N HGNLGP FRB’K ZGKNFPQV JY LGK UVVR, LV YGXORVZ NLFN LV’Z UV QJPYOKVZ. FPZ, HLVP IJO’RV QJPYOKVZ, NLV UVKN ASFQV NJ UV GK NLV NRFGP KNFNGJP, UVQFOKV IJO QFP UV QJPYOKVZ JP F PGQV, QJJS AFNQL JY YSJJR NGSV, FPZ AFKK JON OPNGS NLV PVWN NRFGP FRRGDVZ. LV KAOP FRJOPZ, KNOBUSVZ FHFI F YVH KNVAK, NLVP AOTVZ FSS JDVR NLV KGZVHFST. COKN FPJNLVR YRGZFI PGXLN.


Thinking About Encryption, Part 18


Just when I think I’m out, they drag me right back in…

The Gronsfeld cipher is named after a Count Gronsfeld (I can’t find anything for specifically which Count), and is a variant of the Vigenere cipher. The only real difference is that the Gronsfeld key is a string of single digits between 0 and 9, with each digit giving the number of places to shift the matching plaintext letter. Whatever advantage comes from the key not being a human-readable word (or string of words) is completely wiped out by the fact that it only uses 10 alphabets (0-9) (compared to Vigenere’s 26 (A-Z)). You attack Gronsfeld the same way you would Vigenere, either by determining the key length, or by “key elimination” (AKA: a “probable word” approach against the plaintext).
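To make the digit-shift idea concrete, here is a minimal Python sketch. The function name `gronsfeld`, the sample message and the key "31415" are made up for illustration, and it assumes a plain 26-letter A-Z alphabet.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase  # assumption: plain 26-letter A-Z alphabet


def gronsfeld(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter forward (or back, when decrypting) by its key digit."""
    sign = -1 if decrypt else 1
    digits = cycle(int(d) for d in key)  # repeat the key as often as needed
    out = []
    for ch in text.upper():
        if ch not in ALPHABET:
            continue  # drop spaces and punctuation
        shift = sign * next(digits)
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % len(ALPHABET)])
    return "".join(out)


cipher = gronsfeld("ATTACK AT DAWN", "31415")  # hypothetical message and key
print(cipher)
print(gronsfeld(cipher, "31415", decrypt=True))
```

Because every shift is between 0 and 9, a ciphertext letter is never more than nine places past its plaintext letter, which is exactly what makes the cipher weaker than a full Vigenere.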

I encountered Gronsfeld accidentally while reading a Signal Corps Bulletin story written by the famed William Frederick Friedman (1891-1969), first head of the American Signal Intelligence Service (SIS). He began the division with three “junior cryptanalysts” in April 1930 – Frank Rowlett, Abraham Sinkov, and Solomon Kullback. Although the SIS was a secret branch of the U.S. Army Signal Corps, Friedman published Edgar Allan Poe, Cryptographer in American Literature, vol. VIII, no. 3, November 1936, and it was reprinted in issue no. 97 of the Signal Corps Bulletin (for July to September 1937). He followed this up with Jules Verne as Cryptographer, in issue no. 108 (April-June 1940). (These bulletins have been declassified.)

I discovered all this as I was writing up the blog entry on famous substitution ciphers in fiction. Specifically, I was researching Jules Verne’s Journey to the Center of the Earth cipher, and had found a link mentioning that Verne had used ciphers in other works as well. In trying to follow this up, I reached Cipher Mysteries, which had a link to a section of Bulletin no. 108, with Friedman’s essay on Verne. This essay covered Journey (1864), The Giant Raft (1881) and Mathias Sandorf (1885). This got me interested in the Signal Corps bulletins, and I pretty quickly found the essay on Poe in SCB 97.

The Signal Corps Bulletins as a whole are absolutely fascinating, historically speaking. They include technical information, training manuals, casual information, people movements, fiction, and articles showing a growing understanding of the importance of ciphers in the Corps. No. 97 also has an article on “Cipher Busting in the Seventh Corps Area” by Col. Stanley L. James, and on “Analysis Versus the Probable Word” by Howell C. Brown. No. 108 has a piece on “Transpositions” by W.C. Babcock, but that’s missing from the excerpt PDF I’ve found. The problem is being able to find more than 2-3 bulletins online that have articles on ciphers. Fortunately, I did locate the nsa.gov site for declassified papers, specifically the William Friedman Collection. This got me to Articles on Cryptography and Cryptanalysis from the Signal Corps Bulletin. This is a 300-page PDF that collects close to 30 articles, both written by Friedman and others, including the Poe and Verne articles, as well as an addendum on the Poe article that appeared in Bulletin 94.

I’ve already covered the runic cipher in Journey to the Center of the Earth, but in his article Friedman goes into much greater detail on the accuracy of Verne’s descriptions of the attack on the cipher, and whether Verne himself really understood what he was talking about (general consensus – Verne was a half-decent amateur).

In Mathias Sandorf, the cipher uses what Friedman calls a rotating grille. This is a square ruled into a grid, with a few holes punched out. If the cipher text is grouped into 6 rows of 6 letters each, the grid needs to be sized so that it covers exactly the 6×6 text. First, you place the grille over the text with the “up” (north) edge pointing up, and you write out the letters that appear in the holes, left to right top to bottom. You then rotate the grille 90 degrees (right or left, depending on the algorithm), and write out the next set of letters. Do two more rotations, and you have the full message (you encrypt the message the same way). Any sized grids can be used as long as they’re square; even-numbered sides will use all the cells of the grid; odd-numbered sides will leave one cell covered (you can use rectangular grids, but the ways you can rotate them are more limited). With Verne’s cipher, the grid is 6×6, for 36 cells. One-quarter of the cells (9) have holes specifically chosen to expose different letters with each rotation.
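The rotate-and-read procedure is easy to sketch in Python. This is only an illustration under my own assumptions: the 4×4 hole layout `HOLES`, the clockwise rotation and the sample message are invented, and this is not Verne's 6×6 grille (the post doesn't give its hole positions). It only handles even-sided grids where the four placements cover every cell once.

```python
def rotate_cw(holes, n):
    """Rotate hole coordinates 90 degrees clockwise on an n x n grid."""
    return sorted((c, n - 1 - r) for r, c in holes)


def grille_positions(holes, n):
    """Cells exposed over the four placements, each read left to right, top to bottom."""
    order = []
    current = sorted(holes)
    for _ in range(4):
        order.extend(current)
        current = rotate_cw(current, n)
    if len(order) != n * n or len(set(order)) != n * n:
        raise ValueError("grille must expose every cell exactly once")
    return order


def grille_encrypt(plaintext, holes, n):
    if len(plaintext) != n * n:
        raise ValueError("message must fill the grid exactly (pad with nulls)")
    grid = [[""] * n for _ in range(n)]
    for (r, c), ch in zip(grille_positions(holes, n), plaintext):
        grid[r][c] = ch  # write through the holes
    return "".join("".join(row) for row in grid)


def grille_decrypt(ciphertext, holes, n):
    return "".join(ciphertext[r * n + c] for r, c in grille_positions(holes, n))


HOLES = [(0, 0), (1, 3), (2, 2), (3, 1)]  # hypothetical 4x4 grille (row, column)
secret = grille_encrypt("MEETMEATNOONXXXX", HOLES, 4)
print(secret)
print(grille_decrypt(secret, HOLES, 4))
```

The trailing "XXXX" plays the same role as the nulls at the end of Verne's message: it pads the text out to a full block.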

The ciphertext, taken from wikisource.org is:

ihnalz zaemen ruiopn
arnuro trvree mtqssl
odxhnp estlev eeuart
aeeeil ennios noupvg
spesdr erssur ouitse
eedgnc toeedt artuee

You’ll have to massage it a bit to make the text square. OR, just write it out on graph paper, one 6×6 block at a time.

The text as written out from the grille is reversed, and needs to be re-reversed to read: “Tout est prêt. Au premier signal que vous nous enverrez de Trieste, tous se lèveront en masse pour l’indépendance de la Hongrie. Xrzah.” (“All is ready. At the first signal you send us from Trieste, all will rise en masse for the independence of Hungary.”) Verne has his main character claim that the last 5 characters are a “conventional signature,” when they’re actually nulls to pad out the block. According to Friedman, Verne makes a few assumptions in the deciphering of this message that a professional cryptographer would have never made. But, still, Verne’s doing better than Poe.

Ok, this gets me to The Giant Raft. In this story, the hero, Joam Dacosta, AKA Joam Garral, is accused of committing a murder. A letter is sent proving that Joam is innocent, but it’s encrypted and no one has the key. The Judge on the case, Jarriquez, sets out to break the cipher in 8 days before Joam is to be executed. A large part of the story consists of Jarriquez’ (mostly failed) attempts at this. The last paragraph of the letter reads:

Phyjslyddqfdzxgasgzzqqehxgkfndrxujugiocytdxvksbxhhuypohdvyrymhuhpuydkjoxphetozsletnpmvffovpdpajxhyynojyggaymeqynfuqlnmvlyfgsuzmqiztlbqgyugsqeubvnreredgruzblrmxyuhqhpzdrrgcrohepqxufivvrplphonthvddqfhqsntzhhhnfepmqkyuuexktogzgkyuumfvijdqdpzjqsykrplxhxqrymvklohhhotozvdksppsuvjhd.

The Judge eventually decides that this is a Gronsfeld cipher. Verne himself was convinced that Gronsfeld was impossible to decipher without the key, or at least without a significant amount of intelligence and luck. What’s funny is that in the same issue of the Signal Corps bulletin as Friedman’s article, is Howell Brown’s “Analysis Versus the Probable Word,” which specifically demonstrates how to attack a short Vigenere cipher (when attacking the key won’t work) using a “probable word.”

Howell’s example uses the ciphertext:
“YGFAT NZAQS CAAAX QSGGO EZAGP RYAXX”

His approach is to write the word he thinks may be in the plaintext (the “probable word”) vertically at the left of the message, and apply each of its letters to the whole ciphertext, one row per letter, to see what key pops out. If the probable word really is in the plaintext, its first letter lines up with some ciphertext letter in the first row, its second letter with the next ciphertext letter in the second row, and so on, making for a key fragment that reads diagonally top left to bottom right. Howell uses “BEARER” as his probable word.

- YGFAT NZAQS CAAAX QSGGO EZAGP RYAXX
B ZHGBV OABRT DBBBY RTHHP FABHQ SZBYY
E -KJEX RDEUW GEEEB UWKKS IDEKT VCEAA
A --FAT NZAQS CAAAX QSGGO EZAGP RYAXX
R ---RK EQRHJ TRRRS HJXXF VQRXG IPROO
E ----X RDEUW GEEEB UWKKS IDEKT VCEAA
R ----- EQRHJ TRRRS HJXXF VQRXG IPROO

The advantage of this approach is that it’s easier than writing “BEARER” on a separate piece of paper and sliding it under the ciphertext and checking the key values individually. It’s a bit difficult to read, but starting at the 8th letter, “BUSTER” is spelled out diagonally, and this is the most promising key to try applying to the rest of the ciphertext, because it is the only thing in English here.

To make the key easier to read, delete the leading nulls for lines 2-6, and reformat:

ZHGBV OA B RT DBBBY RTHHP FABHQ SZBYY
KJEXR DE U WG EEEBU WKKSI DEKTV CEAA
FATNZ AQ S CA AAXQS GGOEZ AGPRY AXX
RKEQR HJ T RR RSHJX XFVQR XGIPR OO
XRDEU WG E EE BUWKK SIDEK TVCEA A
EQRHJ TR R RS HJXXF VQRXG IPROO

Reversing the approach, Howell writes out the ciphertext in rows of 7 letters each:

BUSTER?
-------
YGFATNZ
AQSCAAA
XQSGGOE
ZAGPRYA

And gets

BUSTER?
-------
YGFATNZ
dontle?
AQSCAAA
bearer?
XQSGGOE
eeanyd?
ZAGPRYA
cument?

Assuming the first line is “don’t let,” the key becomes “BUSTERS” and the message is: “don’t let bearer see any documents.”
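Here is a rough Python sketch of the same search. One assumption to flag loudly: judging from the rows in the table above (Y under B gives Z, G under E gives K), the key letter here is the ciphertext letter plus the probable-word letter, mod 26, rather than the subtraction you usually see for Vigenere. The function names are my own.

```python
A = ord("A")
CIPHERTEXT = "YGFAT NZAQS CAAAX QSGGO EZAGP RYAXX"


def key_fragments(ciphertext: str, word: str) -> list[str]:
    """Key letters implied by placing `word` at every offset in the ciphertext."""
    ct = ciphertext.replace(" ", "")
    frags = []
    for i in range(len(ct) - len(word) + 1):
        # Assumption read off the table: key = cipher + plain (mod 26).
        frags.append("".join(
            chr((ord(c) - A + ord(p) - A) % 26 + A) for c, p in zip(ct[i:], word)
        ))
    return frags


def decrypt(ciphertext: str, key: str) -> str:
    """Same convention run backwards: plain = key - cipher (mod 26)."""
    ct = ciphertext.replace(" ", "")
    return "".join(
        chr((ord(key[i % len(key)]) - ord(c)) % 26 + A) for i, c in enumerate(ct)
    )


for offset, frag in enumerate(key_fragments(CIPHERTEXT, "BEARER")):
    print(offset + 1, frag)  # the 8th line is the one that reads as English

print(decrypt(CIPHERTEXT, "BUSTERS"))
```

Each printed fragment is one of the diagonals from the hand-written table, so scanning the output for something readable replaces squinting along the diagonals.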

Why is this useful? Well, with Gronsfeld, the key is made up of the digits 0-9, and the Judge in the story thinks that the message contains either the name of the suspect, or the name of the sender of the message. Verne makes the wrong assumption that the suspect’s name (Dacosta) is only allowed to appear at the very beginning, or the very end of the text, and the author’s name (Ortega) is not revealed to the readers until near the end of the story. If he had used Howell’s method on the probable word “Dacosta”, the idea would be to accept only diagonal numbers where the difference between the word and the cipher text is between 0 and 9. (To get started, P – D = 16 – 4 = 12. This is too big and gets ignored. Also, because the alphabet wraps at “Z” – Z + 3 = C – negative numbers between -1 and -9 are also allowed, with the sign removed.)

- PHYJSLYDDQFDZXG
D -4-6-8-00-20--3
A -------44-54--7
C ---7-8-11-31--4
O ----4----2---9-
S ------6-----75-
T ------5-----64-
A -------44-54--7

Pulling out the leading nulls for lines 2-6 shows that none of the columns consist of only 0-9.

-4-6-8-00-20--3
------44-54--7
-7-8-11-31--4
-4----2---9-
--6-----75-
-5-----64-
-44-54--7

There are two things to note here – first is that there is no diagonal line that is made up only of the digits 0-9. Second, we’re only trying this example on a very small part of the beginning of the cipher. If we (and Verne) were to continue in this way through the entire cipher message, we’d crack the key a little more than halfway in. Using a computer, this would be trivial, but it is kind of easy to make a manual mistake here, or to overlook the correct diagonal string.
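To show how little work this is for a computer, here is a minimal Python sketch that slides a probable word along the whole cryptogram and keeps only the offsets where every implied shift is a digit 0-9. The function name `digit_hits` is mine, and the 25-letter alphabet without W is an assumption based on this post's note about the French alphabet. Unlike the hand table above, it doesn't treat negative differences specially; it just takes each difference modulo the alphabet length.

```python
FRENCH = "ABCDEFGHIJKLMNOPQRSTUVXYZ"  # assumption: 25 letters, no W

CIPHERTEXT = (
    "Phyjslyddqfdzxgasgzzqqehxgkfndrxujugiocytdxvksbxhhuypohdvyrym"
    "huhpuydkjoxphetozsletnpmvffovpdpajxhyynojyggaymeqynfuqlnmvly"
    "fgsuzmqiztlbqgyugsqeubvnreredgruzblrmxyuhqhpzdrrgcrohepqxufiv"
    "vrplphonthvddqfhqsntzhhhnfepmqkyuuexktogzgkyuumfvijdqdpzjqsyk"
    "rplxhxqrymvklohhhotozvdksppsuvjhd"
).upper()


def digit_hits(ciphertext, word, alphabet=FRENCH):
    """Offsets where `word` could sit in the plaintext using only shifts 0-9."""
    hits = []
    for i in range(len(ciphertext) - len(word) + 1):
        shifts = [
            (alphabet.index(c) - alphabet.index(p)) % len(alphabet)
            for c, p in zip(ciphertext[i:], word)
        ]
        if all(s <= 9 for s in shifts):
            hits.append((i, "".join(map(str, shifts))))
    return hits


for word in ("DACOSTA", "ORTEGA"):
    print(word, digit_hits(CIPHERTEXT, word))
```

A hit only means the word is possible at that offset; short digit strings can turn up by chance, so each candidate still has to be tried against the rest of the message.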

This is where Friedman’s attack on the key length makes more sense (and in fact, Howell also advocates attacking the key length on longer ciphertexts). We look for 2-, 3- and/or 4-character groupings that appear more than once in the message.

Phyjslyddqfdzxgasgzzqqehxgkfndrxujugiocytdxvksbxhhuypohdvyrymhuhpuydkjoxphetozsletnpmvffovpdpajxhyynojyggaymeqynfuqlnmvlyfgsuzmqiztlbqgyugsqeubvnreredgruzblrmxyuhqhpzdrrgcrohepqxufivvrplphonthvddqfhqsntzhhhnfepmqkyuuexktogzgkyuumfvijdqdpzjqsykrplxhxqrymvklohhhotozvdksppsuvjhd

Friedman focuses only on 3- and 4-letter groupings:
DDQF (twice, 186 letters apart)
KYUU (twice, 12 letters apart)
HHH (twice, 54 letters apart)
RYM (twice, 192 letters apart)
RPL (twice, 60 letters apart)
TOZ (twice, 186 letters apart)

To be pedantic, what we’re looking for are the factors of these spacings that are common between all groupings. We have to keep in mind, though, that there may be “accidental hits” that are purely coincidental and must be ignored. Note that DDQF and TOZ are both 186 letters apart, so we only need to consider 186 once.

186 – 2, 3, 6, 31, 62, 93
12 – 2, 3, 4, 6
54 – 2, 3, 6, 9, 18, 27
192 – 2, 3, 4, 6, 8, 12, 16, 24, 32, 48, 64, 96
60 – 2, 3, 4, 5, 6, 10, 12, 15, 20, 30

Of all these factors, only 2, 3 and 6 are common across all groupings (i.e. – there are no accidental hits). But, keys of lengths 2 or 3 are too short to be secure, so we can assume that the key length must be 6 digits long.
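The repeat-hunting and the common-factor step can both be sketched in a few lines of Python. The helper name `repeat_spacings` is made up for this example; the list of spacings is the one given above.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeat_spacings(text: str, n: int) -> dict[str, list[int]]:
    """Distances between successive occurrences of each repeated n-letter group."""
    positions = defaultdict(list)
    for i in range(len(text) - n + 1):
        positions[text[i:i + n]].append(i)
    return {
        gram: [b - a for a, b in zip(pos, pos[1:])]
        for gram, pos in positions.items()
        if len(pos) > 1
    }


print(repeat_spacings("ABCXXXXABC", 3)["ABC"])  # toy input: [7]

spacings = [186, 12, 54, 192, 60]  # the distances listed above
print(reduce(gcd, spacings))  # 6
```

Taking the greatest common divisor is a shortcut for listing every factor by hand, but it only works when there are no accidental repeats in the list; one coincidental spacing would drag the result down, so those have to be weeded out first.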

The next step then is to group the cipher text, to effectively create 6 alphabets. That is, we put letters 1, 7, 13, 19… together for the first row. 2, 8, 14, 20… for the second row. 3, 9, 15, 21… for the third row, etc.

PYZZXRIXHHMYPSMPHYEQYMBSNGRQREIPVQHMEZMQSXMHVS
HDXZGXOVHDHDHLVDYGQLFQQQRRMHGPVHDSHQXGFDYHNHDU
YDGQKUCKUVUKEEFPYGYNGIGEEUXPCQVODNNKKKVPKXKOKV
JQAQFJYSYYHJTTFANANMSZYURZYZRXRNQTFYTYIZRQLTSJ
SFSENUTBPRPOONOJOYFVUTUBEBUDOUPTFZEUOUJJPROOPH
LDGHDGDXOYUXZPVXJMULZLGVDLHRHFLHHHPUGUDQLYHZPD

The purpose of this step is to simply get the letter frequencies of each grouping together. But, rather than sorting the letters from most frequent to least, we want to keep them in alphabetical order for the French alphabet. The reason for this is that each alphabet in a Vigenere or Gronsfeld cipher is a simple Caesar shift. That is, each grouping is just slid a fixed number of letters along the alphabet. So, if we match up the distributions for each group one at a time against the normal plaintext distributions, we can immediately determine how much each group was shifted, which gives us the key. (Keep in mind there’s no “W” in the French alphabet.)

This blog entry is getting too long, so I’ll just show the example for group 2, using the plaintext distribution that Friedman provides for a typical 50-letter message.

It’s pretty clear that group 2 is shifted three positions to the right, giving us a key of _3____.
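As a crude stand-in for lining up the full distributions, here is a small Python sketch that just asks which shift from 0 to 9 turns the most letters of a group into E. This is my own simplification, not Friedman's method, and it is not guaranteed to work on every group. The 25-letter alphabet without W is an assumption taken from the note above.

```python
from collections import Counter

FRENCH = "ABCDEFGHIJKLMNOPQRSTUVXYZ"  # assumption: 25 letters, no W
GROUP_2 = "HDXZGXOVHDHDHLVDYGQLFQQQRRMHGPVHDSHQXGFDYHNHDU"


def guess_digit(group: str, alphabet: str = FRENCH) -> int:
    """Pick the shift 0-9 that maps the most letters of the group back to E."""
    counts = Counter(group)
    e = alphabet.index("E")
    return max(
        range(10),
        key=lambda s: counts[alphabet[(e + s) % len(alphabet)]],
    )


print(guess_digit(GROUP_2))  # 3
```

Group 2 contains nine H's, and H is three places after E, which is why this group is such a clean example.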

Doing the same thing for the other groupings (check out Friedman’s article for the details) gives you the key 432513.
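With the key in hand, deciphering is just subtraction. A minimal Python sketch, assuming the 25-letter French alphabet without W mentioned above (the function name is my own), applied to the opening of the cryptogram:

```python
FRENCH = "ABCDEFGHIJKLMNOPQRSTUVXYZ"  # assumption: 25 letters, no W


def gronsfeld_decrypt(ciphertext: str, key: str, alphabet: str = FRENCH) -> str:
    """Slide each ciphertext letter back by the matching key digit."""
    out = []
    for i, ch in enumerate(ciphertext.upper()):
        shift = int(key[i % len(key)])
        out.append(alphabet[(alphabet.index(ch) - shift) % len(alphabet)])
    return "".join(out)


print(gronsfeld_decrypt("PHYJSLYDDQFDZXGAS", "432513"))  # LEVERITABLEAUTEUR
```

Note that the third letter only comes out right because W is skipped: Y minus 2 is V in the 25-letter alphabet, but W in the usual 26-letter one.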

Going back to Verne’s story, once the Judge learns that the name of the author of the letter is “Ortega,” he tries applying it to the first 6 letters of the ciphertext, then the last 6.

PHYJSL
ORTEGA
1-45--

SUVJHD
ORTEGA
432513

Presuming that Ortega signed the letter at the end, and that he now has the key, the Judge proceeds to decipher the entire letter, and Dacosta is saved at the final minute. Friedman’s contention is that if Jules Verne really did understand the cipher he’d written his story around, the story would have been at least 50% shorter.

To be honest, I have difficulty in remembering that Vigenere (and the Gronsfeld variant) are just simple shifts of the entire alphabet when you’re dealing with short keys. Friedman’s demonstration of how to attack the key and then obtain the amount of shift for each alphabet makes things a lot easier for me to understand. But, the point of Howell’s article is that if the ciphertext is short, then you have no choice but to use the probable word approach.

Summary:
1) Gronsfeld uses the digits 0-9 for the key.
2) It’s considered a bit more difficult to break because the key is not a human-readable word.
3) Because it only uses 10 alphabets, Gronsfeld is actually easier to break if you find the key length.
4) The “probable word” approach demonstrated by Howell works for both Vigenere and Gronsfeld when you have shorter ciphertext messages.
5) Once you have the key length, comparing the letter frequencies of each alphabet group to the letter distributions of the plaintext language will help in giving you the shift value for each group.
6) In cases where a given group doesn’t have a clear letter distribution, you can try applying as much of the key as you already have to the ciphertext in order to guess at different words in the plaintext, and get the rest of the key that way.
7) Signal Corps bulletins rule!

Wiggles, part 14


Just a little ongoing story to give you something to play with until the next blog post.

“C’M KXQQW, KCQ. WXB’SF YPD AXX MBOY AX DQCNG. WXB IFAAFQ IF RXCNR NXV.” AYF DQBNG HBKA KNPQZFD PA MF PND DBR CN YCK YFFZK. CA VPK P LQCDPW NCRYA, PND AYF IPQ C’D LXBND VPKN’A PZZ AYPA UPOGFD, IBA CA VPK RFAACNR CNAX MCD-KBMMFQ PND AYF PCQ OXNDCACXNCNR DCDN’A VXQG. PZKX, AYF UZPOF VPK APOYC-NXMC (KAPND-PND-DQCNG), VYCOY YPD UFXUZF KAPNDCNR P ZCAAZF OZXKFQ AX FPOY XAYFQ AYPN BKBPZ, PND AYF YFPA, YBMCDCAW PND QFFG OXMICNFD AX OXMUFZ UFXUZF CNAX XQDFQCNR ZXAK XL IFFQ, PND AYFQFLXQF RFAACNR P ZXA DQBNGFQ AYPN VPK RXXD LXQ AYFM. XNF XL AYF OZCFNAFZF AYCK FSFNCNR VPK P KPZPQWMPN (P DFKG-IXBND VYCAF OXZZPQ UPUFQ UBKYFQ) CN YCK LXQACFK, AQWCNR AX IZXV XLL P ZXA XL KAFPM PLAFQ P IPD DPW XL VXQG. CN YCK OPKF, AYCK AXXG AYF LXQM XL P UCAOYFQ XL IFFQ XN PN FMUAW KAXMPOY, PND AYFN AQWCNR AX RQXUF XNF XL AYF WXBNRFQ XZK (XLLCOF ZPDCFK, IPKCOPZZW P MCNCMBM VPRF-ZFSFZ KFOQFAPQW KXMFVYFQF). “NPNC KXQF, IPGP WPQXB. GXGX VP NCYXN, NCYXN-RX VX CB!” (“VYPA VPK AYPA, WXB IPKAPQD. AYCK CK HPUPN, KUFPG HPUPNFKF!”) YF KNPQZFD, VYCZF AQWCNR AX GFFU YCK YFPD BU.

Board Cat Kit


I was on a business trip to Osaka recently, and when I arrived at the airport on the way back home, I had a few hours to kill. The gift shop in the main lobby had a number of little laser-cut plywood kits for sale, and I figured I might as well buy one and see what it’s like to build. There’s a large variety to the kits, from animals to musical instruments (a piano, cello and guitar) to big $60 units for making Himeji Castle and a 2′-tall Ferris wheel. While I was tempted to go big, I have no place to keep finished kits like that, so I settled on the Sitting Cat for 1,000 yen ($9 USD.) (I also had a bowl of ice cream for dinner.)

The kit comes in a flat envelope, which includes 2 sheets of thin, pre-cut plywood, and the instruction sheet. The instructions are pictorial only, but still pretty easy to follow. The pieces have to be punched out, and that was probably the most time-consuming part. They do stick in the main form, and you have to be careful because they will break. What may have helped the most might have been if I’d had a cutter knife, and just removed bits of the main form to make taking the pieces out easier.

The pieces are interlocking and force-fit, so you don’t need glue. This is also good since I had to take everything apart a few times because I got the pieces in the wrong sequence. I didn’t see a suggested assembly time, and I wasn’t really paying attention to when I started. I did have a specific deadline, in that I wanted to get past the security checkpoint shortly after check-in opened up (the airline here didn’t allow check-in until 90 minutes before boarding). Either way, I think I took 90 minutes total. If I ever make another one of these, I know I’ll be a lot faster now that I understand what I’m doing.

There’s a 1″ wide strip along the length of one of the sheets that contains an entire backup collection of small pieces that are most likely to break. This was a lifesaver, because one piece shattered as I was trying to punch it out, and a second piece broke as I was trying to push it into place on the main assembly. I needed those backups.

I waited until I got home to take the last two photos of the completed cat, because the lighting is better here than in the restaurant. Overall, it was fun, although a bit frustrating, to build. Next time, I’d want sandpaper to open up some of the notches to make the pieces fit together a little more smoothly.

Most cats are board. This one just doesn’t bother to hide it.

Thinking About Encryption, Part 17


I keep coming back to Vigenere, but the reason for it this time is that I was finally struck by a weakness in using the Running Key cipher. This is related to the use of a word or shorter phrase as the key for a plain Vigenere cipher. Generally, when experts say that Vigenere is unbreakable, they’re talking about using a random string of letters (or numbers 1-26 to represent letters) to create a key longer than the plaintext. But, this is impossible to remember. You need to record the random key on paper somewhere, and use it as a one-time pad.

The older approach to Vigenere was to use one long word, or a short phrase for the key, but this is vulnerable to cycle counting (looking for repeating strings of cipher text that have cycles that are factors of the key length). Instead, we can use the Running Key, which as mentioned in the last entry, is a text string taken from a book used as the source text. The advantage of this approach is that if your book is longer than your plaintext message, and you use a given key string once and only once, it acts like a one-time pad that is easier to remember and/or generate.

But, by using a key string made up of human-readable text, you add a level of predictability to the key that you don’t have with a random collection of characters, allowing someone else to attack the key instead of the ciphertext.

Say you have the following cipher, and you have reason to believe it’s a Vigenere. You run a frequency count on it and there’s no obvious substitution letter distribution (it doesn’t follow the ETAOIN SHRDLU distribution) and you’re sure the plaintext was in English. There are also no repeating collections of cipher characters you could use to determine key length. So, you guess that either the plaintext is too short for character collections to occur, or that the sender used a running key. But, there’s one more thing that you think may be likely – and that is that the plaintext might contain the words “mountain,” “submarine,” or “ocean.”

IOAKASTDERGPQQXHVQOLGAABAGRKWAFLEEKSMBOCFUCEYQJH

The idea now is that we apply each of these words, one at a time, across the full length of the ciphertext to see what we get out. And, just to play it safe, we try to decipher the text with these words as a partial key.

Now, just to save myself some work, I’ll tell you that none of the above three keywords appear in the key string, so trying to decrypt the ciphertext with them just results in garbage.

For example, applying the key “MOUNTAIN” to the first 8 letters gives:
IOAKASTD – message
WAGXHSLQ – proposed plaintext

Running “MOUNTAIN,” “SUBMARINE,” and “OCEAN” as keys along the entire ciphertext gives us equally unreadable “plaintexts”.

Then, say we run “MOUNTAIN” along the ciphertext to see what keys could have generated this cipher.

Ciphertext – Reversed key
IOAKASTD – WAGXHSLQ
OAKASTDE – CMQNZTVR
AKASTDER – OWGFADWE
KASTDERG – YMYGKEJT
ASTDERGP – OEZQLRYC
STDERGPQ – GFJRYGHD
TDERGPQQ – HPKENPID
DERGPQQX – RQXTWQIK
ERGPQQXH – SDMCXQPU
RGPQQXHV – FSVDXXZI
GPQQXHVQ – UBWDEHND
PQQXHVQO – DCWKOVIB
QQXHVQOL – ECDUCQGY
QXHVQOLG – EJNIXODT
XHVQOLGA – LTBDVLYN
HVQOLGAA – VHWBSGSN
VQOLGAAB – JCUYNASO
QOLGAABA – EARTHATN

This last bit is kind of promising, in that it looks more like English than any of the other strings do. So, maybe “mountain” does exist in the plaintext 17 characters in. If so, characters 17 to 25 can be skipped in future checks. We go to “ocean,” and character positions 1-16 result in garbage. Starting at position 26:

Ciphertext – Reversed key
GRKWA – SPGWN
RKWAF – DISAS

Everything else is garbage. Note that the hit here starts at position 27, leaving one letter of plaintext between “mountain” and “ocean.” We can reasonably guess that maybe the plaintext is actually “mountains”, so let’s test that again:

Ciphertext – Reversed key
QOLGAABAG – EARTHATNO

QOLGAABAGRKWAF – EARTHATNODISAS

We could stop here, but let’s push a bit farther for the sake of experiment. Switching to “submarine”, positions 1-16 still just give us garbage. Starting at 31,

Ciphertext – Reversed key
LEEKSMBOC – TKDYSVTBY
EEKSMBOCF – MKJGMKGPB
EKSMBOCFU – MQRABXUSQ
KSMBOCFUC – SYLPOLXHY
SMBOCFUCE – ASACCOMPA

If we put the pieces together again, we get:

QOLGAABAGRKWAFLEEKSMBOCFUCE – EARTHATNODISAS____ASACCOMPA

Proposed plaintext (positions 17-43):
mountainsoceanXXXXsubmarine

Making kind of a leap, let’s guess that “XXXX” is “s and”:

LEEK – TERH

Gives us “EARTHATNODISASTERHASACCOMPA”

Doing a Google search on “that no disaster has” gives us the first paragraph of Mary Shelley’s “Frankenstein” – “You will rejoice to hear that no disaster has accompanied the commencement of an enterprise which you have regarded with such evil forebodings.”

Using this as the running key, we get the plaintext:
“KAGOSHIMA IS HOME TO MOUNTAINS, OCEANS, AND SUBMARINE LIFE.”

This may be kind of contrived, but it does show how vulnerable the running key cipher is to attacks on the key. If the cipher book used for the key is online, and you choose the right plaintext words, the book will eventually show up in an internet search and the cipher will fall. And, the longer the plaintext, the greater the vulnerability.
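The word-sliding used in this example can be sketched in Python. The helper names `subtract` and `drag` are mine; the ciphertext and the key text are the ones from this post, and the arithmetic is ordinary Vigenere (ciphertext minus known text, mod 26).

```python
A = ord("A")
CIPHERTEXT = "IOAKASTDERGPQQXHVQOLGAABAGRKWAFLEEKSMBOCFUCEYQJH"
KEY = "YOUWILLREJOICETOHEARTHATNODISASTERHASACCOMPANIED"


def subtract(ciphertext: str, text: str) -> str:
    """Letter-wise (cipher - text) mod 26; feed it plaintext to get key, or key to get plaintext."""
    return "".join(
        chr((ord(c) - ord(t)) % 26 + A) for c, t in zip(ciphertext, text)
    )


def drag(ciphertext: str, crib: str) -> list[str]:
    """Key fragment implied by placing the probable word at every offset."""
    return [
        subtract(ciphertext[i:i + len(crib)], crib)
        for i in range(len(ciphertext) - len(crib) + 1)
    ]


print(drag(CIPHERTEXT, "MOUNTAIN")[17])   # EARTHATN
print(drag(CIPHERTEXT, "OCEAN")[26])      # DISAS
print(drag(CIPHERTEXT, "SUBMARINE")[35])  # ASACCOMPA
print(subtract(CIPHERTEXT, KEY))          # the recovered plaintext, without spaces
```

The offsets in the code count from 0, so offset 17 is the 18th row of the hand-worked table. Picking the readable fragments out of the list is still a job for a human, or at least for a dictionary check that this sketch leaves out.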

The easiest way to harden this cipher is to do a simple transposition. Break up the word patterns with Scytale or rail fence, or reverse every other line of the main text. You can use the word lengths of the first three or four words of the running key text for the transposition keys, if you like.

What’s interesting here is that the strength of Running Key is that it’s easy to remember and implement, yet its weakness is that because it’s easy to remember, it has a predictability and set of rules that allows it to be exploited. Which means, we either employ a convenient, broken system, or an inconvenient, impossible to break algorithm.

Which brings me to the concept of “randomness.” One of the most common themes that I’ve encountered regarding ciphers, electronics and software in general is that there’s really no such thing as “a truly random number generator,” except for maybe pure noise (e.g., a floating, detuned antenna), or a cosmic ray detector. This is particularly important for ciphers when we talk about one-time pads. If we look at the running key, we can see patterns: grammatical patterns, word-level patterns, and letter-level patterns (repeating letter combinations, or combinations that never occur with normal English words, like “qurzctl”). To make Vigenere truly unbreakable we need a truly random key. However, software random number generators are actually “pseudo-random.” In the old days, they always started with the same numerical sequence, which could be repeated whenever you turned the computer on. You could shake things up a bit by using a “seed,” which was a selectable value for creating a sequence with a different starting point. But, unless the seed was the system time and date, the sequence could be replicated across PCs. Modern PC languages do use more “random” sequences for their generators, but there is still a possibility that the generator (a mathematical algorithm) can get into a predictable sequence.
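The seed behavior is easy to see in Python. In this small sketch the function names and the seed 1234 are made up; `random.Random` is the standard library's seeded pseudo-random generator, and `secrets` draws from the operating system's randomness source instead.

```python
import random
import secrets
import string


def seeded_key(seed: int, length: int) -> str:
    """Key letters from a seeded pseudo-random generator (repeatable)."""
    rng = random.Random(seed)
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(length))


def os_random_key(length: int) -> str:
    """Key letters drawn via the secrets module (not repeatable from a seed)."""
    return "".join(secrets.choice(string.ascii_uppercase) for _ in range(length))


print(seeded_key(1234, 20) == seeded_key(1234, 20))  # True: same seed, same key
print(os_random_key(20))
```

Anyone who learns or guesses the seed can regenerate the first kind of key exactly, which is the replication problem described above.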

Where this matters to us, is when we have a long plaintext message, and we’re generating a random Vigenere key string. If the generator has a repeatable pattern that takes the form of ASCII characters “A-Z,” then theoretically, we can use that pattern sequence in the same way we did with “submarine” and “mountain” in breaking the running key cipher, by sliding the sequence (say, “ABBQRT”) across the ciphertext to attack the key. If human-readable text appears in our predicted plaintext, such as “defenestr,” we can try guessing at the rest of the plain text (“defenestration”) and work back and forth between attacking the plaintext and attacking the key.

Does this really matter? I’m not sure. My gut reaction is that software random number generators are random enough that even if you encounter a predictable character string, chaos theory will bite you eventually, in that if you don’t know the exact algorithm and the exact starting seed, the sequence will veer off into the unknown very quickly and you’ll be back where you started, unable to tell if you really did crack part of the key, or if you’re lying to yourself. On the other hand, I’m not an expert in random generator algorithms, and I don’t know what is, or isn’t, bad about them. When in doubt, randomize, randomize, randomize.

Finally, one other method for hardening Vigenere is to create a random distribution of letters for the first line of the Vigenere table (say, “BADCFEHGJILKNMPORQTSVUXWZY”) and Caesar-shift that by one letter to the left for each subsequent row. Example:

BADCFEHGJILKNMPORQTSVUXWZY
ADCFEHGJILKNMPORQTSVUXWZYB
DCFEHGJILKNMPORQTSVUXWZYBA
CFEHGJILKNMPORQTSVUXWZYBAD etc.

If you want to make this easier to remember, we have the old “pseudo-random” Caesar shift idea, where we take a keyword (such as JULY-AUGUST), remove duplicated letters (JULYAGST) and fill in the rest of the string starting from the last letter of the keyword, wrapping when we hit “Z”. Like:

JULYAGSTVWXZBCDEFHIKMNOPQR
ULYAGSTVWXZBCDEFHIKMNOPQRJ
LYAGSTVWXZBCDEFHIKMNOPQRJU
YAGSTVWXZBCDEFHIKMNOPQRJUL etc.

One other change we need to implement here is that the key no longer refers to the first letter of the line in the table, but rather to the line number (A=line 1, B=line 2, C=line 3, etc.).
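Building the keyword-mixed table is mechanical enough to sketch in Python. The function names are my own, and the sketch only builds the table; it doesn't pick a convention for how plaintext letters map onto columns.

```python
import string


def keyed_alphabet(keyword: str) -> str:
    """Keyword with duplicates removed, then the unused letters, continuing
    from the keyword's last letter and wrapping at Z."""
    seen = []
    for ch in keyword.upper():
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    start = string.ascii_uppercase.index(seen[-1]) + 1
    rest = string.ascii_uppercase[start:] + string.ascii_uppercase[:start]
    return "".join(seen) + "".join(ch for ch in rest if ch not in seen)


def tableau(first_row: str) -> list[str]:
    """Each row is the previous one shifted one letter to the left."""
    return [first_row[i:] + first_row[:i] for i in range(len(first_row))]


rows = tableau(keyed_alphabet("JULY-AUGUST"))
for row in rows[:4]:
    print(row)
```

The first four printed rows match the JULY-AUGUST table shown above.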

From what I understand. If I’m wrong or off-base, someone, please correct me.

Wiggles, part 13


Just a little ongoing story to give you something to play with until the next blog post.

DYFWU LN JYP HXODZN. WTWPUYFW L QFYM HXGPAWN RXWLP NODGZXY MXWFWTWP RXW HPWBLRN AWR KYM, GFB RXWF ONWN RXWLP ZXYFWN RY COU WTWPURXLFA GR RXW NXYZN. L DWGF, MW ONW YOP WKWHRPYFLH MGKKWRN JYP WTWF RXW DYNR RPLTLGK NROJJ, KLQW COULFA G 100 UWF ZGHQ YJ JOONWF AOD (COCCKW AOD). RXW NWHYFB L AWR NYDW HGNX, L POF RY RXW FWGPWNR QYFCLFL (HYFTWFLWFHW NRYPW) GFB RPGFNJWP LR RY RXW ZXYFW. L BYF’R RXLFQ RXWPW’N G ZKGHW LF IGZGF RXGR NWKKN NROJJ RY HONRYDWPN – G PWNRGOPGFR, G CGP, G QYFCLFL YP G BWZGPRDWFR NRYPW – RXGR LNF’R WSOLZZWB MLRX G NODGZXY PWGBWP. NY, FY, FY KOHQU CLKKN YP HYLFN NGJWKU XLBBWF LF DU NXYW. LJ L FWWB RY HGKK DYD, L’TW AYR DU ZXYFW, GFB NXW HYOKB ZGUZGK DW DYFWU LJ L FWWBWB LR. GFB LJ DU ZXYFW WTWP NRYZZWB MYPQLFA? XWU, RXLN LN IGZGF. FY YFW QWWZN WKWHRPYFLHN KYFA WFYOAX JYP LR RY NRYZ MYPQLFA. NXLF-XGRNOCGL, CGCU!

Thinking About Encryption, Part 16


In the last post I mentioned the Beale Ciphers (AKA: the Beale Papers). So, I was planning on getting into book ciphers this time. But, the more I thought about them, the more they bothered me.

By definition, a book cipher is one in which the “key” is some section of text from an agreed upon book. Now, I have suggested that the keys for Vigenere ciphers could come from book or movie titles, or specific paragraphs from the book, but this is more in keeping with running key ciphers (more about those below). Instead, book ciphers generally involve counting words or letters in the text of the book, and can be thought of as being one of two types.

In the first type, you count the words in the book, and use the words corresponding to the numbers to build the ciphertext. For example, in this blog entry, word 1 is “In,” 2 is “the”, 3 is “last,” etc. Therefore, if the ciphertext is

8 16 17 23 21

Then the message is “Beale was planning this book.”
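The word-counting version takes only a few lines of Python. This sketch uses the opening sentences of this entry as the "book"; the function names are made up, and stripping punctuation with a regular expression is my own choice for how to count words.

```python
import re

TEXT = (
    "In the last post I mentioned the Beale Ciphers (AKA: the Beale Papers). "
    "So, I was planning on getting into book ciphers this time."
)
WORDS = re.findall(r"[A-Za-z]+", TEXT)  # punctuation is not counted


def decode_words(numbers):
    """Look up each 1-based word number in the book."""
    return " ".join(WORDS[n - 1] for n in numbers)


def encode_words(message):
    """Use the first occurrence of each word; raises ValueError if a word is missing."""
    lowered = [w.lower() for w in WORDS]
    return [lowered.index(w.lower()) + 1 for w in message.split()]


print(decode_words([8, 16, 17, 23, 21]))
print(encode_words("Beale was planning this book"))
```

The ValueError for a missing word is the code-book weakness in miniature: if the word isn't in the book, there is simply no number for it.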

This method is more accurately a code rather than a cipher, and the text becomes a code book. The greatest weakness here is that your book needs to contain the words you plan on using. That is, if you are going to write home about troop movements and your code book is “Healthy Southern Cooking,” you’re in trouble. The thing is, you have to select a book that you would reasonably be expected to carry, and if your cover is that you’re a financial planner, having “Jane’s Book of Tanks” might draw unwanted suspicion towards you.

This is where the second type of book cipher comes in. It is a true substitution cipher, and because it allows for multiple values of all letters in the alphabet, it is homophonic. This time, we either use the first letter of each word, or we count all the letters in the words (starting from some page and/or paragraph either from the front or back of the book). Using just the first letters of each word from this blog, starting from the top, the message

13 10 3 6 11 18 17

would read “palmtop”.

Interestingly, I don’t have any words in the first paragraph that start with “e,” “d” or “n,” so my initial plan to spell out “cipher,” and “plaid plan” didn’t work out. I would need a much larger block of working text to get 10-15 words that start with “e”; and “x” and “z” might never show up in the book at all. One workaround for “x” would be to let words starting with “ex” (expect, exact) stand in for it.
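To check which letters a text can actually supply, here's a rough Python sketch that builds the first-letter table for the first paragraph of this entry (typed in by hand, so that part is an assumption) and lists the letters with no word at all. The decode call uses 13 10 3 6 11 18 17, where word 17 ("planning") supplies the second "p".

```python
import string
from collections import defaultdict

# Assumption: the working text is the first paragraph of this entry.
PARAGRAPH = ("In the last post I mentioned the Beale Ciphers (AKA: the Beale Papers). "
             "So, I was planning on getting into book ciphers this time. "
             "But, the more I thought about them, the more they bothered me.")

def clean_words(text):
    """Words in order, with surrounding punctuation stripped."""
    return [w.strip(string.punctuation) for w in text.split()]

def first_letter_table(text):
    """Map each letter to the 1-based numbers of the words that start with it."""
    table = defaultdict(list)
    for number, word in enumerate(clean_words(text), start=1):
        table[word[0].lower()].append(number)
    return dict(table)

def decode_first_letters(numbers, text):
    """Read off the first letter of each numbered word."""
    words = clean_words(text)
    return "".join(words[n - 1][0].lower() for n in numbers)

table = first_letter_table(PARAGRAPH)
missing = sorted(set(string.ascii_lowercase) - set(table))
print(table["p"])      # word numbers that can stand for "p"
print(missing)         # letters this text cannot encode at all
print(decode_first_letters([13, 10, 3, 6, 11, 18, 17], PARAGRAPH))
```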

We can get around this last problem by counting all the letters (I = 1, n = 2, t = 3, h = 4, e = 5), but this gets ugly fast, as we get into the 10,000’s before reaching chapter 2 of the book. That’s really not a problem, though, as we could be fine just using the first 3-4 pages of our book. In Arthur Conan Doyle’s The Valley of Fear, Doyle has his cipher message start with the page number and column of the cipher book where you’re supposed to start counting. I view this as a weakness in this kind of cipher. It’s going to be a gimme for any authority that has a copy of the message and a list of all of the books you’re carrying. It also illustrates part of why I dislike book ciphers as a practical way to exchange messages on a regular basis.

37 103 2 7 59 39 139 = run away
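The letter-count version is easy to sketch in Python. This assumes the same first paragraph of this entry as the book, counts only alphabetic characters, and folds everything to lower case; the function names are just for illustration.

```python
# Assumption: the book is the first paragraph of this entry, typed in by hand.
PARAGRAPH = ("In the last post I mentioned the Beale Ciphers (AKA: the Beale Papers). "
             "So, I was planning on getting into book ciphers this time. "
             "But, the more I thought about them, the more they bothered me.")

def book_letters(text):
    """Every alphabetic character in order; spaces and punctuation are skipped."""
    return [ch.lower() for ch in text if ch.isalpha()]

def decode_letter_numbers(numbers, text):
    """Look up each 1-based letter position in the book."""
    letters = book_letters(text)
    return "".join(letters[n - 1] for n in numbers)

print(decode_letter_numbers([37, 103, 2, 7, 59, 39, 139], PARAGRAPH))  # runaway
```

Word breaks aren't carried by the numbers, so the reader has to put the space back in "run away" by eye.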

They’re slow! They’re bulky! They’re error-prone!

You really need to double-check the letter distributions of the book first. In my example from this blog entry, “u” only shows up three times, and “y” doesn’t appear until the end of the text. If the plan is to have multiple instances of the most common letters, your book needs to allow for that. But, that can be someone else’s problem since HQ or your handler may be the ones picking the book for you.

Let’s say that you’re a spy, and you’re in some other country. If you get searched, you don’t want to have a blatant code book on you, and whatever book you do have should be in keeping with your persona. But, what if the authorities look in the book? If there are numbers over each of the letters, or at the beginning of each word, that’s going to look suspicious. And, if you wrote up a table with the alphabet and strings of numbers representing each letter to save you the effort of numbering all the words every time you want to send a message, that’s going to be equally suspicious. This means that whenever you want to send an encrypted message back home, you’re going to have to recount all the letters in your book over and over again. It’s going to be easy to make a mistake, and you’ll spend a lot of time recounting the letters to make sure you didn’t slip up. You don’t want to do this all the time, and you certainly wouldn’t want to rely on a book cipher in an emergency when you only have a few minutes to write up your ciphertext before making an escape.

So, book ciphers for sending messages back home are a pain. But, it wouldn’t be so bad if HQ used the book cipher for sending instructions TO you. If you only receive messages occasionally, deciphering them wouldn’t be so bad and HQ could use software to do the letter counts to ensure that the numbers are correct. And, in this situation, HQ could assign different books to different agents, and keep the titles for each book in a file somewhere. In fact, it would be trivial to index each book once, build up a homophonic table for the entire alphabet for each book, and just use the correct table when encrypting messages to send to any given agent. Life would be easy for HQ, just not for the people in the field.
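The "index each book once" idea might look something like this in Python. It's only a sketch: the book is the first paragraph of this entry, the table maps each letter to every position where it occurs, and the encipher step picks one of those positions at random. Passing in a random.Random instance is my own choice so the run can be repeated.

```python
import random
from collections import defaultdict

# Assumption: the agent's "book" is the first paragraph of this entry.
PARAGRAPH = ("In the last post I mentioned the Beale Ciphers (AKA: the Beale Papers). "
             "So, I was planning on getting into book ciphers this time. "
             "But, the more I thought about them, the more they bothered me.")

def build_homophone_table(text):
    """Index the book once: letter -> list of 1-based letter positions."""
    table = defaultdict(list)
    position = 0
    for ch in text:
        if ch.isalpha():
            position += 1
            table[ch.lower()].append(position)
    return dict(table)

def encipher(message, table, rng=random):
    """Replace each letter with a randomly chosen position for that letter."""
    numbers = []
    for ch in message.lower():
        if ch.isalpha():
            if ch not in table:
                raise ValueError(f"book has no letter {ch!r}")
            numbers.append(rng.choice(table[ch]))
    return numbers

def decipher(numbers, text):
    """What the agent in the field does: count letters and look each one up."""
    letters = [ch.lower() for ch in text if ch.isalpha()]
    return "".join(letters[n - 1] for n in numbers)

table = build_homophone_table(PARAGRAPH)
numbers = encipher("run away", table, random.Random(1))
print(numbers)
print(decipher(numbers, PARAGRAPH))
```

HQ would keep one table per agent's book; the agent only ever needs the decipher half.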

There’s another question here, though. If the idea is to avoid arousing the suspicions of the authorities by carrying an innocuous book with you, why are you expecting to get searched? Yes, there may be routine checkpoints where you may have to hand over your passport, or your papers, but having all of your belongings checked probably means that they suspect you already. In which case, you can expect that they’re going to record the titles of the books in your possession. But, that’s really not going to help anyone very much unless they’ve managed to intercept an encrypted message or five. Meaning that if they have a message, and they have you, any book you’re carrying will be used to try cracking said message. And, if that works, and they still let you go, you know that they’re planning on encrypting their own messages to send to your HQ, too.

If we look at the wiki entry on book codes, we can see that they’re very popular in fiction. In real life, there’s a mention of Benedict Arnold’s Arnold cipher, and there’s something called Cicada 3301, which I’ve never heard of before. And, of course, there’s the Beale Papers.

One of the more common examples for books for use with the book cipher is the Bible, because it’s easily found anywhere, and Christians can be expected to carry a copy with them. Naturally, if the enemy intercepts a ciphertext that has page and paragraph numbers on it that kind of look like Bible passages, the first book they’re going to check is the Bible. But, this does suggest a solution to the above issue of the authorities looking at your belongings. If you don’t need the book for enciphering or deciphering messages every day, then don’t keep a copy with you. Pick a book that you can buy at any bookstore when you need it. Or, use the Gideon’s Bible left in the desk of your hotel room.

With a Bible cipher, you can use passage numbers to narrow down the text to a specific passage, and just count the letters in that passage. Another option is to use the page number, the line number on that page, and then the letter count value (i.e. – 3/14/20 = page 3, line 14, 20th letter in from the left). But, this weakens the cipher somewhat by making it look more like a book cipher.

If you haven’t heard of the Beale Papers, you can check the Wiki entry. Letter #2 is the only one that’s ever been publicly decrypted, and that used the Declaration of Independence. Even so, there are errors, and certain adjustments that need to be made to do the decryption. Can you imagine the amount of work necessary for counting three separate documents, up to 1000 words each, if letters #1 and #3 really do use different source books? Why was the Declaration so easily obtained by Beale, when the other two letters apparently use cipher keys that don’t involve any other book or document that could reasonably be expected at the time to be found at a bookstore, library, or in the possession of the innkeeper where Beale stayed? I’m leaning toward the theory that the Beale Papers are a hoax perpetrated to sell the pamphlets claiming to contain instructions for where to find the “Beale treasure.”

Additionally, if you look at the treasure with a critical eye, do you really think anyone in Beale’s party would have allowed him to leave by himself with entire wagons filled with gold and jewels? Through Native American territory? I wouldn’t have. I would have stuck to those wagons like glue all the way back to Tennessee, probably along with everyone else in that mining party. And then I would have cashed in my share the second I got to a big enough city.

Anyway, book ciphers. If I were to implement a book cipher today, I’d make the book something available for download online. I’d put 30-40 ebooks on a tablet, and have a simple app to do the word counts for me and generate the letter table automatically. If necessary, I’d delete the table when I was done, and I’d have a system where different books would be used as the key based on the day of the month, or have the book title somehow incorporated in the ciphertext so that the message would tell you which book to use. Alternatively, selecting books from copyright-free hosting sites would also work. But, maybe I wouldn’t pick “The Dancing Men,” “The Gold-Bug” or Doyle’s “The Valley of Fear” (which revolves around a book cipher using Whitaker’s Almanac).

I’ll mention here that book ciphers are often referred to on the net, and in fiction, as Ottendorf ciphers. The wiki talk page for book ciphers suggests that the name may come from Major Nicholas Dietrich, Baron de Ottendorf, “a German mercenary at the time of the American Revolution.” He worked for Major Andre, when Andre was negotiating with Benedict Arnold (using the Arnold cipher) in the failed attempt to surrender West Point to the British in 1780. Andre, and his superior, General Henry Clinton, were supposedly known to use book ciphers, but there’s no clear explanation for why they’re named after Ottendorf.

Specifically, Ottendorf ciphers consist of number triplets separated by hyphens (e.g. – 5-12-3). The first number is the line number on the page used for the cipher. The second value is the word in that line, and the third is the number of the desired letter in that word. The first group in the ciphertext can be used to give the page for that message.
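As a rough illustration of how the line-word-letter triplets get read, here's a short Python sketch. The three-line "page" is made up purely for the example, and all counts are 1-based.

```python
import string

# Hypothetical page, invented for this example only.
PAGE = [
    "the quick brown fox",
    "jumps over the lazy dog",
    "and runs back home again",
]

def decode_ottendorf(cipher, page_lines):
    """Decode 'line-word-letter' triplets against a page given as a list of lines."""
    out = []
    for triplet in cipher.split():
        line_no, word_no, letter_no = (int(x) for x in triplet.split("-"))
        words = [w.strip(string.punctuation) for w in page_lines[line_no - 1].split()]
        out.append(words[word_no - 1][letter_no - 1])
    return "".join(out)

print(decode_ottendorf("3-4-1 2-2-3 2-4-1 2-1-4", PAGE))  # help
```

Everything hinges on both sides having identical line breaks, which is why a reflowed web page makes a poor source text.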

Ottendorf is used in the film National Treasure, where the keys are Benjamin Franklin’s Silence Dogood essays. I have to admit that I haven’t seen the movie, and all I can find online is a clip on YouTube showing the numbers being transcribed from the back of the Declaration of Independence, and a reference to the Dogood essays. The problem is in finding out which essay was used, and getting a photocopy of it (the online transcriptions of the letters are not formatted properly, and they don’t use Franklin’s original spellings). At the moment, I don’t know if the movie shows a real cipher, or just fakes it.

The first few triplets are:
10-11-8 10-4-7 9-2-2 14-8-2 18-7-4 13-10-4 1-5-1 5-8-1

This supposedly corresponds to “The vision to see the treasured past, comes as the timely shadow crosses in front of the house of Pass and Stow”, but I can’t recreate that using what I have. The problem with Ottendorf in the computer age is that you need to use a source text where the line breaks don’t change when you resize the browser. That is, you really need a paper copy of the book or source text, or a scan thereof.

Now, for the Running Key Cipher. In fact this IS the Vigenere cipher, just using text from some desired book to create really long keys. And, this is just an extension of what I was alluding to in my paint mixer algorithm. (Note that by using a book key, RK is actually much, much easier to break than the Vigenere cipher is. See the next blog entry.)

To use the Running Key Cipher, pick a book, any book. You then want to tell your recipient what page and line number to start from. This way, you can reuse this book for as many messages as you want, and just change the starting page and line numbers. In the key, you need to remove all spaces and punctuation, but retain letter repetitions.

I’m going to use Verne’s Journey to the Center of the Earth as my book, but because I’m using the online version at etc.usf.edu, I’m saying that Chapter 1 starts with page 1. I’ll then start with line 4, including the blank line in the line count. My cipher key will start with “Marthamusthaveconcludedthatshewasverymuchbehind”, and can be made as long as my plaintext. And my plaintext is, “Cheese burgers are the cheesiest burgers I know.”

Marthamusthaveconcludedthatshewasverymuc
CheeseburgersarethecheesiestburgersIknow
———————————
OHVXZENOJZLRNETSGJPWKIHLPELLIYNGWMWZIZIY

We break this string up into groups of 5 (or whatever), adding padding as necessary:

OHVXZ ENOJZ LRNET SGJPW KIHLP ELLIY NGWMW ZIZIY
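The same steps in Python, as a minimal sketch: strip the key text and plaintext down to letters, add them mod 26 with A=0, and break the result into groups of five. The padding letter "X" is my assumption (this message happens not to need any), and the code only handles plain A-Z letters.

```python
KEY_TEXT = "Martha must have concluded that she was very much behind"
PLAINTEXT = "Cheese burgers are the cheesiest burgers I know."

def letters_only(text):
    """Upper-case letters with spaces and punctuation removed (assumes plain A-Z text)."""
    return [ch.upper() for ch in text if ch.isalpha()]

def running_key_encrypt(plaintext, key_text):
    """Vigenere addition (A=0) using book text as a key as long as the message."""
    plain = letters_only(plaintext)
    key = letters_only(key_text)
    if len(key) < len(plain):
        raise ValueError("key text is shorter than the message")
    return "".join(chr((ord(p) + ord(k) - 130) % 26 + 65) for p, k in zip(plain, key))

def running_key_decrypt(ciphertext, key_text):
    """Subtract the same key letters to recover the plaintext."""
    cipher = letters_only(ciphertext)
    key = letters_only(key_text)
    return "".join(chr((ord(c) - ord(k)) % 26 + 65) for c, k in zip(cipher, key))

def groups_of_five(text, pad="X"):
    """Pad to a multiple of five (pad letter is an assumption) and insert spaces."""
    text += pad * (-len(text) % 5)
    return " ".join(text[i:i + 5] for i in range(0, len(text), 5))

ciphertext = running_key_encrypt(PLAINTEXT, KEY_TEXT)
print(groups_of_five(ciphertext))
print(running_key_decrypt(ciphertext, KEY_TEXT))
```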

Finally, starting with A=0, we encode the 3-digit page number and the 2-digit line number as letters. For page 1, line 4, the digits 001 and 04 become AABAE. While the wiki entry suggests putting our encoded number in the second to the last grouping of the ciphertext, like:

OHVXZ ENOJZ LRNET SGJPW KIHLP ELLIY NGWMW AABAE ZIZIY

we can actually use any method we like, as long as both sender and recipient agree to it beforehand. Personally, I think reversing the indicator (EABAA) and putting one of its letters at the beginning of each of the first five groups, as follows, seems a little less obvious, assuming whoever intercepts our message has also read the wiki article.

EOHVX AZENO BJZLR ANETS AGJPW KIHLP ELLIY NGWMW ZIZIY

Other variants of the Running Key cipher have already been covered in previous blog entries, such as using the running key for a binary substitution, XORing the individual bits of the plaintext with the individual bits of the key.

The wiki article also talks about “Ciphertext appearing to be plaintext.” I like this concept, because it makes the cipher look less like a cipher. The idea here is the classic military “Alpha Bravo Charlie” substitution of words for individual letters. If you have a dictionary and a random number generator, the above text could turn into something like:

“Evidently other humans viewed Xerox as zero-entried neutral objects.” etc.

At a cursory glance, this “cipher as plaintext” just looks like a badly-written technical paper. It does NOT look like a secret message, and this makes the message somewhat less likely to be confiscated during a casual body search.
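A bare-bones Python sketch of the cover-word idea follows. The word list is a tiny made-up stand-in for a real dictionary and only covers a handful of letters, so treat it as a toy; the recipient just reads the first letter of each word.

```python
import random

# Hypothetical mini-dictionary; a real one would cover all 26 letters many times over.
WORDS = ["other", "objects", "humans", "hardly", "viewed", "various",
         "xerox", "xylophones", "zero", "zebras", "every", "early"]

def index_by_first_letter(words):
    """Group the dictionary by upper-cased first letter."""
    index = {}
    for word in words:
        index.setdefault(word[0].upper(), []).append(word)
    return index

def to_cover_text(ciphertext, index, rng=random):
    """Swap each cipher letter for a randomly chosen word starting with it."""
    return " ".join(rng.choice(index[c.upper()]) for c in ciphertext if c.isalpha())

def from_cover_text(cover):
    """Recover the cipher letters from the first letter of each word."""
    return "".join(word[0].upper() for word in cover.split())

index = index_by_first_letter(WORDS)
cover = to_cover_text("OHVXZ E", index, random.Random(0))
print(cover)
print(from_cover_text(cover))
```

Getting the output to read like an actual sentence takes a human (or much smarter software) picking the words, which is the hard part this sketch skips.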

Summary:

1) Book codes use books as the code book, substituting location numbers for full words.
2) Book codes require the book to contain the words you plan on including in your plaintext, which restricts the kinds of books you can use.
3) Book ciphers substitute word or letter location numbers for the plaintext on a letter-by-letter basis.
4) Numbering can start from the beginning or end of the book, from any given page or line.
5) If your job includes the occasional body search, you can’t have the numbers written in the book, or tables of numbers on a sheet of paper. So, you have to recount all the numbers every time you need to encipher a message.
6) Book ciphers are time-consuming, and prone to miscounts.
7) They’re better for messages from HQ to operative, rather than the other way around.
8) They’re popular in fiction.
9) Pick ebooks, and have multiple texts on a tablet, plus indexing software.
10) Book ciphers are very hard to break if you use obscure books that can be easily downloaded from the net when needed.

11) Running Key Ciphers can use a Vigenere table, and then really long keys from a book.
12) RK ciphers are NOT book ciphers.
13) Unlike Vigenere, RK can be easy to break (see next blog entry).
14) The page and line numbers are encoded into the ciphertext.
15) All cipher types that consist of letter substitutions (“A-Z”, “a-z”) can employ “ciphertext as plaintext,” where individual cipher letters are replaced by randomly chosen whole words starting with that letter (i.e. – “a” = “alpha,” “a,” “at,” “Arnold.”)

Wiggles, part 12


Just a little ongoing story to give you something to play with until the next blog post.

AZT PISIDTJT TKWUNIVTDA UJD’A QTIVVL TKWUNIVTDA. IA RUXXTQTDA AUCTJ, AZT 100 LTD BFUD UJ IYFWA EFQAZ $1, YWA UA PWJA ZIJ “100” FD AZT XQFDA IDR I XVFETQ FD AZT YIBH. AZT XUQJA SISTQ YUVV UJ 1000 LTD, IYFWA $10, IDR AZIA ZIR DIAJWCT JFJTHU FD AZT XQFDA XFQ I EZUVT, YWA BZIDOTR AF ZURTLF DFOWBZU IXATQ 2004 FQ JFCTAZUDO. JFJTHU UJ FDT FX PISID’J YTJA HDFED EQUATQJ, YWA U IVEILJ ZIATR ZINUDO AF QTIR ZUC UD JBZFFV YTBIWJT ZT EQFAT UD I JAUXX, FVRTQ PISIDTJT JALVT AZIA UJ ZIQR AF EIRT AZQFWOZ. DFOWBZU EIJ I QTJTIQBZTQ AZIA RUJBFNTQTR AZT BIWJIAUNT IOTDA FX JLSZUVUJ, YWA CFJA FX ZUJ QTJTIQBZ EIJ XIHTR. YIJUBIVVL, PISIDTJT CFDTL RFTJD’A QTIVVL XTTV VUHT CFDTL. ET VUHT ZINUDO CFDTL, YTBIWJT UA CTIDJ AZIA LFW BID JSTDR UA. ET RFD’A VUHT AIVHUDO IYFWA CFDTL, YTBIWJT AZIA JZFEJ I VIBH FX BVIJJ. JF, UDJATIR FX YQIOOUDO IYFWA ZFE CWBZ ET CIHT, ET JZFE FXX EUAZ ZFE CWBZ ET JSTDR FD YUO-AUBHTA UATCJ VUHT BIQJ, ZFWJTJ IDR TVTBAQFDUBJ. IDLEIL, RUR CFC HTTS ZTQ XUQJA 100 LTD BFUD, FQ 1,000 LTD DFAT, JZT CIRT XQFC ZTQ YIQ? FX BFWQJT DFA. JZT WJTR UA AF YWL CFQT DISHUDJ XFQ AZT AIYVTJ. RF U ZINT CL XUQJA 1,000 LTD DFAT XQFC EZTD U JAIQATR YWJHUDO? FQ, RF U HTTS I 100 LTD BFUD UD CL JZFT XFQ “TCTQOTDBUTJ?”

Thinking About Encryption, Part 15


It’s time to move on. I’d like to look at a few very well-known substitution ciphers – the Dancing Men, Pigpen, Gold-Bug, and Jules Verne’s runes. There’s nothing particularly useful that I can add that no one else has already said, except that I want to approach them as an actual font system. Arthur Conan Doyle wrote The Adventure of the Dancing Men as part of The Return of Sherlock Holmes, which ran between 1903 and 1904. It’s a simple substitution cipher, and Holmes goes through the entire process of frequency analysis to show how to break it. The thing is, it’s much easier to assign letters (A-R) to the dancing figures, and then do the frequency counts on those, especially if you use ASCII-based software à la En/De.

If you standardize the heights of the figures, you can save them as individual image files, and substitute the filenames for the symbols themselves, although I expect that making a simple figure grid and writing the program in Java may be the best approach for something like this. I’ve noticed that Blogspot.com messes up the spacing between the individual character images, but wordpress seems to handle the images just fine. It’s clumsy, having to save the image files to a hosting site and then link to them in a blog like this. It would make a lot more sense to keep the images on the local PC, and make the filenames short (i.e. – “a.jpg”, “b.jpg”, etc.)

Doyle didn’t use every letter in the alphabet, and he didn’t assign symbols to digits. I have seen full tables on the net for “A-Z” and “0-9”, but they may not be canon. In the interest of full disclosure, I do have to say that I used one of the images I found on the net for most of the characters. But that image had errors and missing text from the Doyle story. I then went to Project Gutenberg for error checking, and the scans used in the HTML version of the short story there had mistakes, too. I’m reprinting the Dancing Men ciphertexts below, with the image files used on a letter-by-letter basis (without the errors, I hope) just to illustrate how this would work if you had a Dancing Men font. You could do a frequency count on the .jpg filenames to break the cipher, if you wanted.

The only thing I should mention about a “font approach” to the Dancing Men is the trickiness of handling spaces between words. The idea is that a man holding a flag is followed by a space. Obviously, Doyle rigged his figures so that the only letters holding flags in his story were those that had their arms extended out to the right of the reader. That allowed me to create two dedicated flag markers, one with the flag at the top, and the other with the flag at the bottom. This approach won’t work for figures that don’t have their arms out in the matching position. The only real alternative is to have two sets of alphabetic images – ones with the flags and ones without. Enciphering will then require checking letter pairs – if the second character of the pair is a space, replace both of them together with the corresponding figure holding the flag. Inelegant, but that’s part of Doyle’s legacy.
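The "two sets of images" approach can be sketched in a few lines of Python. The "a.jpg" style filenames come from above; the "_flag" suffix for the flag-holding figures is a naming convention I've made up for the example.

```python
def dancing_men_files(message):
    """Map a message to image filenames, one per letter.

    A letter followed by a space is replaced, together with that space,
    by the flag-holding version of its figure (hypothetical '_flag' suffix).
    """
    text = message.lower()
    files = []
    for i, ch in enumerate(text):
        if not ch.isalpha():
            continue
        followed_by_space = i + 1 < len(text) and text[i + 1] == " "
        files.append(f"{ch}_flag.jpg" if followed_by_space else f"{ch}.jpg")
    return files

def read_dancing_men(files):
    """Reverse the mapping: a flag figure means 'letter, then a space'."""
    out = []
    for name in files:
        out.append(name[0])
        if name.endswith("_flag.jpg"):
            out.append(" ")
    return "".join(out)

files = dancing_men_files("am here")
print(files)
print(read_dancing_men(files))
```

The list of filenames is also exactly what you'd feed into a frequency count.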

A similar approach works for the Pigpen cipher. Again, just replace the symbols with the letters “A-Z”, and treat it as you would any random substitution cipher. As described in the wiki article, this is also known as the Masonic cipher, Freemason’s cipher, Napoleon cipher, and tic-tac-toe cipher. If you do use a photo hosting site to hold the image files, it’s a simple matter to write a VBScript file to store the direct image links in an array, and just substitute them into an html file for uploading to the net, if desired. There’s nothing fundamentally different between the Pigpen and the Dancing Men ciphers.

Poe published The Gold-Bug in 1843. In 1840, he supposedly challenged his readers to send him any monoalphabetic cipher they wanted, and he claimed to have cracked all of the more than one hundred submissions he received (a claim that William Friedman (who I will talk about later) says was probably horribly exaggerated). The Gold-Bug was written in response to the American public’s interest in cryptograms. The story itself doesn’t stand up well with time, but it is one of the earliest works of fiction to feature a cipher as being central to the story, as well as showing Poe’s development as a detective writer. I’m having trouble finding scans of the actual book page with the cipher on it. Most of the images on the net are second-hand copies, and the text used in the HTML version of the story on Project Gutenberg has mistakes in it. If you can find a copy of The Works of Edgar Allan Poe, use that. The reason is that Poe used a slight trick where some of the characters are jammed up against each other to show a sentence break, and this trick is not replicated in any of the electronic copies of the cipher I can find.

As with the Dancing Men, it’s easier to crack this cipher by substituting the characters with the letters “A-Z” and then running the results through a decipher app. You have to add the word breaks yourself, but that’s a trivial thing. One or two words might be difficult to solve because a couple of the letters only show up once each, but you can take guesses as to what they’re supposed to be. Otherwise, there’s nothing really difficult about the Gold-Bug cipher. There’s no real point in creating a special font for it, since the symbols and numbers can all be found in the Unicode character set.
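Here's a small Python sketch of that "swap the symbols for A-Z, then count" step. Labels are handed out in order of first appearance, which is an arbitrary choice of mine, and the sample is just the first twenty symbols of the Gold-Bug ciphertext.

```python
import string
from collections import Counter

def relabel(symbols):
    """Replace each distinct symbol with A, B, C... in order of first appearance.

    Whitespace is dropped. Assumes at most 26 distinct symbols.
    """
    labels = {}
    out = []
    for s in symbols:
        if s.isspace():
            continue
        if s not in labels:
            labels[s] = string.ascii_uppercase[len(labels)]
        out.append(labels[s])
    return "".join(out), labels

sample = "53‡‡†305))6*;4826)4‡"   # opening of the Gold-Bug cipher
relabelled, labels = relabel(sample)
print(relabelled)
print(Counter(relabelled).most_common(3))
```

The relabelled string can go straight into any ordinary substitution solver; the labels dictionary lets you map the answer back onto the original symbols.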

This is the cipher on the official Edgar Allan Poe Society website, and in the wiki entry.

53‡‡†305))6*;4826)4‡.)4‡);806*;48†8¶60))85;;J8*;:‡*8†83(88)5
*†;46(;88*96*?;8)*‡(;485);5*†2:*‡(;4956*2(5*—4)8¶8*;4069285); )6†8)4‡‡;1(‡9;48081;8:8‡1;48†85;4)485†528806*81(‡9;48;88;4 (‡?34;48)4‡;161;:188;‡?;

This is the ciphertext from the Project Gutenberg HTML version of the story, with corrections.

53‡‡†305))6*;4826)4‡)4‡.);806*;48†8¶60))85;1‡);:‡*8†83(88)5
*†;46(;88*96*?;8)*‡(;485);5*†2:*‡(;4956*2(5*—4)8¶8*;4069285);
)6†8)4‡‡;1(‡9;48081;8:8‡1;48†85;4)485†528806*81(‡9;48;(88;4
(‡?34;48)4‡;161;:188;‡?;

Note that while the official version of the instructions is “TWENTY ONE DEGREES AND THIRTEEN MINUTES NORTHEAST”, the Gutenberg version says “FORTY ONE DEGREES AND THIRTEEN MINUTES NORTHEAST”. I’m not sure if this difference is because of changes in editions between printings, or if it was an intentional alteration made by whoever typed up the Gutenberg version of the story.

Any symbolic substitution cipher can be implemented easily in VBScript as long as the symbols are stored as individual images in a path that VBScript can reach. In the following example, I assigned the letters “A-z” plus some punctuation to the cards in a playing card deck. I hardcoded my plaintext in the script, ran through the string one character at a time, looked up the characters in a matching array, and output the ciphertext in the form of “img src=” statements that I incorporated into an html file. I then screencapped that, and cropped it to another .jpg file using Gimp. You wouldn’t want to do this to send serious messages to your recipient, but as a challenge to your readers it’s ok. As with any symbolic substitution cipher, it may help to replace the symbols with letters or numbers, and then apply frequency analysis as you would with a normal substitution cipher.

Jules Verne apparently loved ciphers, and used them in a few of his stories, although he really wasn’t an expert in them. The first case for this was the “runic cipher” that appeared in his 1864 Journey to the Center of the Earth.


(From the wiki page)

The runes are in 21 groups of symbols, initially arranged 3×7, which isn’t important to the code. Essentially, Verne treats the “Icelandic runes” as a substitution cipher, where almost all of the runes have a single corresponding western letter, although one of them is a doubled-up “mm”. I’ve gone through the wiki entry on runes, and the Unicode rune tables, and while some of the characters do match up to Verne’s transliterations, the one he uses for “mm” is pretty much “kk” in all of the tables I’ve seen. I also had trouble finding some of the other characters. So, either Verne used an “official” runic character set that I haven’t found yet (he writes that it’s “in the magnificent idiomatic vernacular”), or he made up some of the runes. In the latter case, we just treat them as a simple random substitution cipher into Latin. (Letter frequencies here.)

ᛯ.ᛦᚳᛚᛚᛋ ᛅᛋᛦᛅᚢᛅᛚ ᛋᛅᛅᚴᛁᚦᛚ
ᛋᛇᛏᛋᛋᛘᚩ ᚢᚿᛅᛅᛁᛅᚩ ᚿᛁᛏᚦᛦᚴᛅ
ᚴᛏ,ᛋᛐᛘᚿ ᛐᛏᛦᛐᛏᛅᛋ ᛋᚤᚨᚦᛦᛦᚿ
ᛅᛘᛏᚿᛐᛅᛁ ᚿᚢᛐᛅᚴᛏ ᛦᛦᛁᛚᛋᛐ
ᛆᛏᚢᛐᛐᛦ .ᚿᛋᚴᛦᚴ ᛁᛅᛐᛐᛒᛋ
ᚴᚴᚦᛦᛘᛁ ᛅᛅᚢᛏᚢᛚ ᚨᛦᛐᚿᛏᚢ
ᚦᛏ,ᛁᛐᚴ ᚭᛋᛅᛁᛒᚭ ᚴᛅᚦᛁᛁᛁ
(From the Unicode table)

A few of the runes are thicker in some cases than others, implying that they are upper case versions of those letters, something I didn’t bother trying to emulate with the Unicode version. And, I faked a couple characters that aren’t in the tables, which I hope isn’t that obvious at first glance. But, none of these little details really matter because Verne immediately follows his runes with his letter substitutions.

mm.rnlls esrevel seecIde
sgtssmf vnteief niedrke
kt,samn atrateS saodrrn
emtnaeI nvaect rrilSa
Atvaar .nscrc ieaabs
ccdrmi eevtVl frAntv
dt,iac oseibo KediiI

I actually had a little difficulty understanding the next step, because Verne gives an unsupported hint that there’s a Scytale transposition here, but leaves the details to the reader to figure out. Essentially, we just stack the 21 groups one above the other, like this:

mm.rnlls
esrevel
seecIde
sgtssmf
vnteief
niedrke
kt,samn
atrateS
saodrrn
emtnaeI
nvaect
rrilSa
Atvaar
.nscrc
ieaabs
ccdrmi
eevtVl
frAntv
dt,iac
oseibo
KediiI

Then read down the columns one character at a time (“mm” = one character), top to bottom, left to right. Note that some groups are 7 characters wide, others are 6. Skip the shorter groups once they run out of letters. This gives us:

mmessvnkasenrA.icefdoK.segnittamvrtnecertse
rrette,rotaivsadvA,ednecsedsadnelacartniii
lvIsiratracSarbmVtabiledmekmeretarcsilvcoI
sleffenSnI

This is the stage that gives the Professor the biggest headache, but to me it was automatic to ask, “is the message reversed?” And of course, it was.

InSneffelsIocvliscrateremkemdelibatVmbra
ScartarisIvliiintracalendasdescende,Avdasviator,
etterrestrecentrvmattinges.Kodfeci.Arnesaknvssemm
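The stack, read-down, and reverse steps are short enough to sketch in Python. The group list is typed in by hand here, with groups 13 and 14 as "Atvaar" and ".nscrc", which is what the runes spell; "mm" is treated as a single character throughout.

```python
import re

# The 21 groups, typed in by hand (an assumption: any typo here changes the result).
GROUPS = """mm.rnlls esrevel seecIde
sgtssmf vnteief niedrke
kt,samn atrateS saodrrn
emtnaeI nvaect rrilSa
Atvaar .nscrc ieaabs
ccdrmi eevtVl frAntv
dt,iac oseibo KediiI""".split()

def tokens(group):
    """Split a group into characters, keeping 'mm' together as one unit."""
    return re.findall(r"mm|.", group)

def read_columns(groups):
    """Stack the groups as rows and read down each column, skipping short rows."""
    rows = [tokens(g) for g in groups]
    width = max(len(row) for row in rows)
    out = []
    for col in range(width):
        for row in rows:
            if col < len(row):
                out.append(row[col])
    return out

def solve(groups):
    """Read the columns, then reverse the whole sequence of characters."""
    return "".join(reversed(read_columns(groups)))

print("".join(read_columns(GROUPS)))
print(solve(GROUPS))
```

Reversing the token list rather than the joined string is what keeps the trailing "mm" from being split up.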

A couple of notes that may or may not be useful: Latin doesn’t have a letter “j”, so it is often rendered as “i”; likewise, “v” is generally treated as “u” (e.g. – Ivlivs Caesar = Julius Caesar). And in this message, “k” turns into “qu.”

Making these substitutions and adding word breaks, we get:

In Sneffels Joculis craterem quem delibat Umbra Scartaris Julii intra kalendas descende, Audas uiator, et terrestre centrum attinges. Quod feci. Arne saknussemm

I’m noticing a few inconsistencies between the text in the wiki entry and that on Online-literature.com. There is definitely a mistake in the ciphertext on Online-lit, where “ccrmi” should be “ccdrmi.” This has a big impact on the transposition step.

Finally, the translation from Latin to English is partly a matter of taste. Online-lit gives it as:

“Descend, bold traveller, into the crater of the jokul of Sneffels, which the shadow of Scartaris touches before the kalends of July, and you will attain the centre of the earth; which I have done, ArneSaknussemm.”

In the wiki entry, it’s:

“Descend, bold traveller, into the crater of the jökull of Snæfell, which the shadow of Scartaris touches (lit: tastes) before the Kalends of July, and you will attain the centre of the earth. I did it. Arne Saknussemm”

In keeping with the symbolic substitution font theme of this entry, just go to the Unicode rune table and grab the characters you want to use, create a random substitution look-up table and knock yourself out.

Just some final notes regarding chronology and timelines. According to the wiki entry, the first recorded use of Pigpen was in 1531, when it was referred to as “The Kabbalah of the Nine Chambers” and used Hebrew characters.

The Vigenere cipher was first described by Giovan Battista Bellaso in 1553.

The Freemasons started using a variant of Pigpen in the early 1700’s.

Supposedly, Thomas Beale hid a mass of gold and jewels somewhere in Virginia in 1822, and had three letters placed in a lockbox in an inn, with orders that it not be opened unless no one claimed it within 10 years. (See below.)

Edgar Allan Poe issued his monoalphabetic cipher challenge in 1840, and published The Gold-Bug in 1843.

Vigenere was finally cracked by Charles Babbage in 1854 (it was being used at the time by lovers to send secret messages to each other through the classifieds in London newspapers).

George Washington’s army knew about Pigpen, and it was used during the American Civil War (1861-1865) by Union soldiers held in Confederate prisons. The Confederacy used a brass cipher disk to implement the Vigenere cipher, but it was constantly being broken by the Union.

Jules Verne published Journey to the Center of the Earth in 1864.

Lewis Carroll described Vigenere as “unbreakable” in his 1868 piece “The Alphabet Cipher” in a children’s magazine.

James B. Ward published The Beale Papers in 1885, describing Thomas Beale’s supposed treasure. Only letter #2 has ever been deciphered.

Arthur Conan Doyle ran The Return of Sherlock Holmes between 1903 and 1904, which included The Adventure of the Dancing Men. Doyle then used a book cipher for The Valley of Fear (1914-1915).

While I haven’t talked about the Beale Cipher yet, I’m including it here to show that there was enough public interest in ciphers leading up to 1885 to make the hoax theory plausible: James Ward may have published his pamphlet just to make a quick buck. He could have been imitating the plot line (i.e. – a search for buried treasure) from The Gold-Bug.

Wiggles, part 11


Just a little ongoing story to give you something to play with until the next blog post.

ZMYGUEVT QH HKEYVWZ… EC’Z UEVW QH HDVVN. IGXU AOYV E AGZ KYGJJN NQDVT, E AGZ ZECCEVT GC QVY QH COY IGXU CGIJYZ QH LN LQL’Z ZVGXU IGK, WQEVT LN OQLYAQKU, AOEJY LN LQL GVW WGW AYKY QSYK GC COY IGK. WGW AGZ VDKZEVT GV EXY CYG GVW LQL AGZ HQJWEVT VGMUEVZ COY AGN ZOY GJAGNZ WEW CQ IY MDC QDC QV COY CGIJYZ. QVY QH COY QCOYK ZYKSEXY LYV XGLY EVCQ COY IGK CQ CGJU QSYK G IYYK. COY TDN VQCEXYW COY JECCJY TQQW JDXU ZOKEVY OETO DM QV COY AGJJ QSYK COY WQQK OY’W RDZC XQLY COKQDTO, GVW ZGEW, “EZ COGC AOYKY NQD UYYM NQDK HEKZC WQJJGK?” LN WGW RDZC ZLEKUYW, IDC LQL JQQUYW G IEC XQVHDZYW GC COY FDYZCEQV. “AON AQDJW E UYYM LN LQVYN COYKY? EH NQD WQV’C ZMYVW EC, EC’Z VQC AQKCO GVNCOEVT.” NYGKZ JGCYK, E GZUYW WGW GIQDC COGC, GVW OY CQJW LY GV GLYKEXGV COEVT GIQDC OQA ZLGJJ ZOQM QAVYKZ AQDJW HKGLY COYN HEKZC WQJJGK, QK HEKZC CYV-WQJJGK IEJJ, COYN’W LGWY GZ G TQQW JDXU ZNLIQJ. COYV OY ZOQAYW LY ZQLY GLYKEXGV LQVYN, AECO COYEK MEXCDKYZ QH AGZOEVTCQV, GVW AOQYSYK COQZY QCOYK TDNZ AYKY. “HQDVWEVT HGCOYKZ,” OY XGJJYW OEL. ODO.