Jager Puzzle


Phil Foglio, creator of the Girl Genius webcomic, likes to occasionally make papercrafts, wallpapers and other things. Here we have an other thing – a papercraft puzzle featuring the jagermonsters. The idea is to rotate the cubes so that you only get one each of the images on each row. I had so much trouble with this that I actually wrote a VBscript to find the solution to this puzzle, to no effect. Then, I went back and corrected some errors, and the script located the first of the many possible correct solutions. So, yes, it’s difficult (for me), but it can be solved. Either way, I printed the PDF on 0.15mm plain stock, which turns out to have the right stiffness without being so thick as to throw off the fold lines on the images.

I’m going to bring this puzzle to my classes and see if any of my students can do it before they crush the paper up in frustration.

Colossal Gardner, ch. 7


The next article for Plane Geometry is on Penrose Tiling.

Unlike the earlier discussion of Rep-Tiles, which are periodic, Penrose Tilings are nonperiodic, and were first discovered by Roger Penrose. Periodic tilings can be made by shifting the pattern in specific directions without rotation or reflection. These are used extensively in the pictures by M. C. Escher. Nonperiodic tilings, at a minimum then, involve rotating or mirror reflecting the tiles. A very simple nonperiodic tile is the triangle. put two triangles back to back to form either a larger triangle, or a 4-sided rhomboid; the tiles themselves get rotated or flipped, but the patterns they make may themselves be periodic. Periodic tiles can be made nonperiodic by giving the edges different colors and requiring that only tiles with matching colors can be adjacent to each other.

Gardner laments that Escher died before learning about Penrose tiles, but I think M.C.’s butterflies print qualifies. Which is interesting to me, because Roger’s father, geneticist L.S. Penrose, invented the unending Penrose Staircase, which Escher depicted in his “Ascending and Descending” woodblock print. (Roger himself worked in general relativity and quantum mechanics.) In 1973, Roger found 6 tiles that are nonperiodic, with notches and tabs that forced the tiles to only fit one way. That was lowered to two triangular tiles, called “kites” and “darts”. Before making them public, he filed patents in the U.K., U.S. and Japan. You can now buy them as a game from Kadon Enterprises.


Penrose tiling, from the wiki entry. (All rights belong to their owners. Images used here for review purposes only.)

In 1993, John Conway came up with a 3D object that could be used to tile an enclosed volume. Called a “biprism”, these objects stack on each other, but in an irregular pattern. Currently referred to as a Schmitt-Conway-Danzer biprism, the cut-out pattern shown below was created for Gardner by Doris Schattschneider, co-creator of the M. C. Escher Kaleidocycles, and first female editor of Mathematics Magazine.


An example of tiling Penrose chickens (from the Martin Gardner book)

The Penrose Tiles are tied to the discovery that quasicrystals, crystals with five-fold symmetry, are possible.


Make your own Conway biprism

I tried to make 4 biprisms, and all four failed in exactly the same way. Granted, I was using very flimsy copy paper, but the final fold when I glued the arms together, always pointed at least 5 degrees off-center, making me think that maybe there’s something wrong with the original pattern. It might be better to use thicker paper stock, and then cut the sides into separate pieces and hold then together with cloth tape. Anyway, the instructions to make these are – use a dried-out pallpoint pen to score along all the lines. Then fold the arms marked “u” upward (valley folds) so that those arms pair up to make a prism on the top of the central rhomboid. And, fold the arms marked “d” downward (mountain folds) to have a matching biprism on the bottom side of the rhomboid.

You may have more fun playing with the Kadon Enterprises game sets, or with your own set of Penrose chickens.

The Code Book – Cipher 7, part 4


The computer fan is noisy, so I break the program, make note of the last permutation calculated, turn the laptop off and go to bed. The next morning, I get ready for work, restart the script where I left off and leave. I return in the afternoon, and I have a few more files stored with 5 hits on 3-letter strings, but the distributions are all wrong. That is, each file has one string that appears 5 times in the permuted ciphertext, but that’s it. There should be several strings in the list at 5 hits, not just the one. A couple of the permutations have one string that appears 3 times, and between 5-10 strings that occur twice each. And none of the 5- or 3- letter strings are bracketed by anything that could be treated as a space character.

The script still isn’t finished, and I’m falling back on the one remaining possibility – I’m doing something wrong with the ciphertext and corrupting the text I’m trying to permute. I’m thinking harder about printing out the script and the ciphertext and sitting down and working everything out by hand, from scratch.

But, I still have two things working for me. First, I’ve saved all my permutation frequency counting results files, so I don’t need to redo them again on the small chance that I’ve done it right. Second, I’ve had a sudden flash of insight on a weakness in this particular cipher regarding something that I really should have seen before.

100 years ago, when this cipher was first used, the idea of running all 40,000+ permutations of a long ciphertext by hand would have been unthinkable. With a modern computer, depending on exactly what it is you’re doing, it may take as little as 2 hours to complete a moderately complex task.

So… I talked about character frequency counts before, and the German alphabet frequency table. When I first ran my frequency counts, I was just looking at one pair of 2 characters at a time to see which sets of individual pairs to pick prior to matching them all together. That is, do columns 1 and 2 potentially look better together than columns 1 and 3 or 2 and 3? etc.

However, say I just pick the column order as 12345678, pairing up 1 and 2, 3 and 4, 5 and 6, and 7 and 8, build up my cipher string this way and count the frequencies of each pairing (i.e. “MU”, “UP”, “CU” and so on). Then I do the same thing for 21345678 (for “UM”, “UP” and “CU”), all the way up to 87654321. You may say, that’s a lot of work that just duplicates what I’ve done before without adding anything new.

And you would be right. But, let’s pretend that the true pairings are 25, 16, 74 and 83. But, I don’t know this yet. And let’s say that the plaintext is really “I eat codes.” Remembering that the keyword is 8 ciphertext characters long, the scrambling is going to actually be for every 4 characters. Even though the words I make will be all jumbled up if I get the order wrong (“I ea”, “t co”, “des.” Or, “aI e”, “ot c” “.des”. Or, “eaI “, “cot “, “s.de”) what WON’T change are the frequency counts of those letters.

In other words, if I run all 40,000+ permutations, and store the frequency counts to a file, somewhere in there will be 24 permutations of 25, 16, 74 and 83 (eg. 16257483, 74258316 and 83162574) which will ALL HAVE THE EXACT SAME FREQUENCY COUNTS.

All I need to do is write one more script to find the best matches (technically, there should be two sets of 24 permutations, because the flipped pairs 5, 61, 47 and 38 will also qualify). I do this thing. I run my script. And I get 1000+ output files. Why? Because I overlooked the fact that each batch of “wrong” permutations will also be the result of 24 pairings of the characters that show up on each set of rows. All I’ve done is race down another deadend. (Although, I could have actually done the frequency counting and kept only the permutations with the best distribution patterns.)

Sob. Sob. Yeah, cry me a river. I’m ready to ditch it all and go buy another Zelda game for my 3DS, but there remains only the last possibility that I’ve been putting off facing. That I did something wrong in putting the ciphertext in the table format. But… There’s one last last-ditch move I can make first. I’ve got access to the Swedish team’s notes, and they’ve got the deciphered final paragraph. That gives me the spacing between words. Assuming that the substitution table includes (and uses) spaces between words consistently, then I can do a character count for the first 10 words or so of the final plaintext paragraph and then use that to go through my ciphertext to search for any character that occurs at identical intervals. So, I write up another script specifically for this purpose, embed the permutation generator, and let it run all 40,000+ variants of the ciphertext.

No hits. Final failure of the sanity checks.

So, I go back to the Singh book and read the appendix. I pull out a pencil and notebook, and I very carefully follow each step. Drumroll – yeah, I’m kicking myself in the head now.

But… you know that old adage from Thomas Edison? “I didn’t fail. I just proved 10,000 ways that something doesn’t work.” (paraphrased.) Yes, I now know what a bad decrypting approach looks like. All the curves, all the frequency counts come out more or less linear. That’s good to know. Plus, I now have all my scripts pre-written, for frequency counting, 3-character word counting, finding bracketing characters, and the permutator.

All I need to do is change the section where I build up the transformation table, and that’s really easy. But now, I’m tired and grouchy, so I skip character counting and just jump right in on looking for three-character groupings and bracketing, brute-force. This takes about 5 seconds, so I know something’s wrong. That is, one of the very first permutations, 23456781, comes back with 65 hits on one three-character group, with brackets (i.e. – “pCy9p”) in the first 25% of the ciphertext string (I’m checking only the first 881 characters again, for speed). I look in the output file, and there’s lots of bracketing, and the distribution on the three-character strings is looking more reasonable. But still, the transposition key is too simple. So I run the script again and let it finish a few hours later. And sure enough, there are only two column permutations with really high 3-character counts, and they’re mirror reflections of each other.

I take this as a sign, but just to be sure, I run the space interval checker again. It hits on 23456781, and 15 other variants (“23456781”, “23457618”, “23457681”, etc.) I plug a couple bracket characters into the substitution table, and can’t really tell if the word breaks make sense or not in German. But, there’s one thing I do know, and that is German has the words “ein”, “eine” and “einer”. And I can do search and replaces on the ciphertext to run another sanity check. At this point, I’m using the key “23457618” with a semi-randomly generated look-up table (I just wrote out the characters A-Z, a-z, 0-9, . and *). All that remains is to start putting the correct characters in the correct places in the table.

It takes me a while to realize that certain words are coming out wrong (I’m using an online German dictionary at this point). A couple of the letters are doubling up (e.g. – I=n and o=n). That’s when I realize that, yeah, it’s still possible to have the 2-column pairings made right, and still have the pairs themselves out of order (“12345678” versus “12563478”). There’s one word I know is in the plaintext – “maschin”. I compile all 16 permutations of the ciphertext key, and then do a find in notepad on “maschin”. That locates the one column permutation that I want, which again turns out to be “23456781”. And then it’s just a matter of interactively displaying the text and plugging in values for the substitution table.

However, I don’t know German, and there are still a number of words that are incomplete that aren’t included in the Swedish team’s notes. I copy a large enough section of the plaintext that I do have done, and do a yahoo search. That’s when I come up with 20+ powerpoint presentations on the net for people that have also cracked this code, and have much more complete explanations for their work. Sigh. Anyway, I’m keeping my scripts. Who knows. They may turn out to be useful someday. In the meantime, there are only 3 ciphers from the Code Book contest that I haven’t tackled – one on the Enigma machine, and two for RSA encryption. I have no interest in RSA, and I’m too tired to tackle Enigma at this time. Maybe later in the Fall, or something.

The Code Book – Cipher 7, part 3


Getting back to the Code Book Contest cipher, we have the first line:

MCCMMCTRUOUUUREPUCCTCTPCCC

Doing a character count, the ciphertext is 7048 characters long. This can be factored as 8 x 881. Assuming that all of the columns are the same length, then there’s a good possibility that the key is 8 letters. Also, we can work on the additional assumption that this is a CEMOPRTU cipher (in accordance with Almgren and team’s solution paper.

1 2 3 4 5 6 7 8
—————
M C C M M C T R
U O U U U R E P
U C C T C T P C
C C C U U P C M
M P R T C C R U
P E C C M U U P
C M P E P P U P
U R U P P M E U
P U C E U U C U

Taking a bit more of the ciphertext, I started by numbering the columns 1-8, and breaking the ciphertext into rows of 8 characters each.

(Edit: This was my first mistake. I didn’t know which way to build up the table – in rows or in columns. I went with rows.)

On its face, without the key, I’m stuck. All possible permutations of the columns are equally likely because every position in the substitution table can appear in the ciphertext. In other words, I can’t tell if the key unscrambles as 17243568 or 78425316, or anything else in between. That gives 8!, or 40,320 different ways the columns can be arranged.

Except… The solution to stage 6 reads:
“THIS STAGE IS NUMBER SIX AND IT IS A PLAYFAIRCIPHER. THE FOLLOWING STAGE IS NUMBER SEVEN WITHIN THE CONTEXT OF THIS COMPETITION AND IT IS ENCRYPTED ACCORDING TO AN ADFGVX TYPE CIPHER. WHEN YOU ATTEMPT TO CRYPTANALYSE  STAGE SEVEN YOU WILL SEE THAT IT IS MORE COMPLICATED THAN A STRAIGHTFORWARD ADFGVX CIPHER. UNFORTUNATELY
I CANNOT GIVE YOU ANY MORE CLUES.

IN TERMS OF THIS STAGE THE SHIBBOLETH THAT YOU SHOULD TAKE NOTE OF IS MOLYBDENUM

YOU WILL PROBABLY HAVE NOTICED BY NOW THAT EACH STAGE IS IN A DIFFERENT CIPHER AND IS USUALLY IN AN APPROPRIATE LANGUAGE. SO THE VIGENERE STAGE WAS IN FRENCH AND THIS PLAYFAIR STAGE IS IN ENGLISH. IF THE PATTERN PERSISTS THEN THE NEXT STAGE WILL BE IN GERMAN. BUT DOES THE PATTERN PERSIST?”

(Note that “Shibboleth” refers to the code for stage 6 – “molybdenum”. Each of the 10 stages has its own code, and the winner of the contest was the first group to mail in all 10 codes to Simon.)

And, yes, the pattern persists – Stage 7 is in German. There are 22,355+ German words of 8 letters, so that part isn’t going to help me with the key. Except…

I now know what the expected letter frequencies should be. If I pair up 2 of the cipher columns, I can do a character count. Even though one column pair represents every 4th letter in the plaintext, the message itself is long enough that any given quarter of the text should still contain a good mix of characters that will come close to the German frequency table. Picking column 1, and permuting it with the other 7 columns, then counting the character frequencies is easy enough to do in a VBscript. Putting the output in Excel to graph the data visually should make the correct combination stand out clearly. Repeat the same steps for the other remaining 3 two-column pairs. There will still be the question of what order the 4 two-column pairs should be in, but that can wait for the moment.

So, I do this. I start with column 1 (MUUCMPCUP) and column 2 (COCCPEMRU) and get MC UO UC CC MP PE CM UR and PU. In this small subsample, there’s no duplications, but in the full text, UU occurs 63 times, UC 33 times, PU times 26, etc. I repeat the process for 1-3 through 1-8, and the pair with what seems to be the best turn out is… 1-2. I go to column 3 and run the permutations, and the best match seems to be 3-2, followed by 3-4. With column 5, it’s 5-1 then 54. And for column 7, it’s 7-6 followed by 7-2. Yeah, this is useless.

Plugging the numbers into Excel and looking at representative graphs gives almost identical-looking curves that aren’t that far off from the German frequency table. The problem is that the transposition chart is case insensitive, while the plaintext is mixed case, and until I break the code, I can’t match up characters for “a” and “A”, “t” and “T”, etc. Additionally, there are 64 characters total in the substitution table, smearing the percentages out and making the curves flatter.

Ok, doing things smarter isn’t helping me. Theoretically, I should be able to cut my work in half based on one flaw in the system. Because the CEMOPRTU table is a square, it is symmetrical along the diagonal CC to UU. Since I don’t know what the original table values were, and I don’t have the original key, it really doesn’t matter if I say EM = “E” and ME = “9”, or if I flip the table and claim EM = “9” and ME = “E”. All that’s important is that I get the plaintext out correctly. So the choice of column pair ordering “12”, “34”, “56” and “78” versus “21”, “43”, “65” and “87” is irrelevant and long as I remain consistent with all 4 pairs.

Fine. I make a stab in the dark and pair up 1 and 2, 3 and 4, 5 and 6, and 7 and 8, in that order. I then go to the next step, which is to arbitrarily assign random characters to the CEMOPRTU table, and convert MCCMMCTR to the characters from the table. There are two reasons for doing this. First, it cuts the number of characters in half (for example, “MCCMMCTR” becomes “CoyQ”.) Second, it makes things easier when counting 3-letter words.

Assuming that the plaintext was in English, one of the most common three-letter words in the language is “the”. On top of this, because the code table is assumed to contain “space,” or some other word delimiter, if there is any three-letter combination that occurs frequently, and is occasionally bracketed by the same character front and back, say “pH9Xp”, then at a minimum I can claim that that specific permutation of ciphertext columns deserves further study.

At this point, my program can’t handle stepping through permutations of the numbers 1-8, so I hardcode the string “12345678” in the program, build up the corresponding cipher string, look up the matching characters in the table, and do the frequency counts for 3-character combinations. I change the string to “34125678”, run the program again. Change to “56123478”, etc. for all 24 permutations (4*3*2*1 = 24). Two of the strings give me one 3-character “word” that shows up three times each. But, it’s not bracketed correctly. If it were, I’d make the assumption that the brackets represent “space”, and I’d make the substitution in the look-up table to do a reality check on the actual word spacing. But, that’s not working out right, either.

Great. Ok, if working smarter in any form at all isn’t helping me, I’ll work that much harder.

The first step is to figure out a way to automate generating permutations. It’s one thing to hand write 1234, 1243, 1324, 1342, 1423, 1432, 2134, 2143, 2314, 2341, 2413, 2431, 3124, 3142, 3214, 3241, 3412, 3421, 4123, 4132, 4213, 4231, 4312 and 4321, if you only have four digits. But at 8 digits, that’s 40,000+ values again. And that would take me a while.

Pseudo code:
s = “12345678”
ptr = 8
do while not done
store the character in s at ptr position to c as an integer
replace that character in s with ” ”
increment c
if c > 8, then decrement ptr
else
check if the new value of c is in s
if yes, keep incrementing c
if no, replace the character in s at ptr with the new value of c,
increment ptr and store 0 to c.
end if
if ptr < 1, reached last permutation – quit program
if ptr > 8, have built up complete new permutation – loop done

Say I have 1342 and I want the next number.
c = 2, and I make s = 134
c + 1 = 3, but 3 is already in s.
c + 1 = 4, but 4 is also already in s.
c + 1 = 5. 5 is greater than 4, so decrement ptr.
c = 4, and make s = 13
c + 1 = 5. 5 is greater than 4, so decrement ptr.
c = 3, and s =1
c + 1 = 4. 4 is not in s, so write 4 to s at ptr.
s = 14, c = 0, ptr = ptr + 1
c + 1 = 1. 1 is in s.
c + 1 = 2. 2 is not in s, so write 2 to s at ptr.
s = 142, c = 0, ptr = ptr + 1
c + 1 = 1. 1 is in s.
c + 1 = 2. 2 is in s.
c + 1 = 3. 3 is not is s, so write 3 to s at ptr.
s = 1423, c = 0, ptr = ptr + 1
ptr is = 5, which signals the end of generating the next permutation.

After a lot of debugging, I get the code to work as a function in VBscript. I put it in with the rest of my program, and start it running. The permutation string increments relatively quickly, so the brute force approach might be feasible. I walk away from the computer for a couple hours, then come back. I’m supposed to be storing the results to file for permutations with 3 hits on any 3-character combination, and nothing came out. I start digging into the code, and uncover so many bugs that I have to question just what exactly I was focusing on when I wrote the script, but I also realize that I made a fatal mistake. (Edit: Just not the CORRECT fatal mistake.)

Because I’d been working with the column arrays so much, I’d used 881 for the length of my check string, and not 7048/2 = 3524 like I should have. My script was only checking 25% of the full ciphertext. That, and I’d mistakenly been counting the  same 3-character strings more than once. But, the point is that by going from 881 to 3548 characters in the counting loop, the execution times went from 20 permutations per second down to a crawl at 1 permutation per 5 seconds. Trading off completeness for speed.

After verifying the code was working right, I started it running again, but I (intentionally this time) constrained the check length to the first 880 characters for speed. 3 hours pass, and I only get 6 files output, half of which are mirror images in the code table, and none have verifiable space bracket characters.

In theory, the German equivalent of “the” should show up in the first quarter of the plaintext, in all lowercase, surrounded by spaces, but that’s not what I’m seeing, out of 40,000+ total column permutations. I’m now facing two possible situations – either my sample space is too limited, and/or I’m doing something else wrong, too. (Edit: Duh.)

I change the string check length to 2000 characters, which drops me to 1 permutation per second. Estimated run time – 11 hours. Then I go do the dishes, and prepare for bed.

To be continued.

The Code Book – Cipher 7, part 2


Ok, we have our CEMOPRTU table, which we can use for the substitution part of the cipher. Again, the table is randomly generated, and both the sending and receiving parties have to have copies of it. If the plaintext is:
I EAT CODES.

Then the ciphertext for the above table would be:

EC UT CP CC MO UT CM ET CO CP MM UU

This brings us to Step 2.
The sender picks a key word. The longer, the better, but it can’t have any duplicated letters. That is, “JAMES” would be ok, but “JIMMY” would need to be stripped down to “JIMY”.

Write out the key word, and under that write the cipher text in the same number of columns.

JAMES
—–
ECUTC
PCCMO
UTCME
TCOCP
MMUU

Then reorder the key alphabetically, along with the associated the columns of text from left to right, top to bottom.

AEJMS
—–
CTEUC
CMPCO
TMUCE
CCTOP
MUMU

Now, it may be obvious here that we’re going to have problems recreating the plaintext if there are spaces on the last row. The recipient might be able to figure out what the ordering is supposed to be based on the key and key length. But, to avoid confusion, we could use “Q” or some other letter that could throw off the frequency counts for padding, and truncate to a half-character if necessary.

JAMES
—–
ECUTC
PCCMO
UTCME
TCOCP
MMUUMC

AEJMS
—–
CTEUC
CMPCO
TMUCE
CCTOP
MUMUM

I’ve seen (at least, I think I have, I’ll talk more about this is a second) two different interpretations of the next step. One is to read the ciphertext from left to right, top to bottom, and send that to our recipient via telegraph.

CTEUCCMPCOTMUCECCTOPMUMUM

The other, which is what’s given in Simon Singh’s book, is to read the columns from top to bottom, left to right. This is the more official step.

CCTCMTMMCUEPUTMUCCOUCOEPM

—–

As far as the sender is concerned, they’re done. So, we switch over to the recipient. The steps go in reverse, but I’m spelling them out in full detail for my own needs.

The recipient gets the ciphertext, as follows:

CCTCMTMMCUEPUTMUCCOUCOEPM

They write out the key in alphabetical order:

AEJMS

CCTCM TMMCU EPUTM UCCOU COEPM

And fill in the ciphertext, breaking it up into columns, assuming that they know how long the columns are supposed to be. The Swedish team commented that the first four characters repeated in the contest cipher, which may be something of a sanity check – if they don’t repeat, you broke up the columns wrong.

AEJMS
—–
CTEUC
CMPCO
TMUCE
CCTOP
MUMUM

They rearrange the key letters back to their proper places, and rearrange the ciphertext columns to match.

JAMES
—–
ECUTC
PCCMO
UTCME
TCOCP
MMUUM

Next, write out the rows from left to right, top to bottom. And, because this text has an odd number of characters, we can discard that last “M”.

EC UT CP CC MO UT CM ET CO CP MM UU

Finally, we go to our CEMOPRTU table, and use the first character of each pair for the row index, and the second for the column index, to get our plaintext back.

I (space) E A T (space) C O D E S
I EAT CODES.

Granted, I could have started out with mixed upper and lower case, and that would have been fine, too. What I really care about, though, is confirmation that I have both sides of the process under control. I can put this cipher text into my VBscript and get the plaintext back out correctly.

Because the next blog entry starts up with my just having the ciphertext, without the key or the CEMOPRTU table.

Edit: Meanwhile, about what I mentioned above regarding two ways of writing out the ciphertext before sending it. As I was starting out with this cipher, it seemed there wasn’t very much written about it on the net. I discovered later that it is pretty well-documented, but the pages are in powerpoint viewers that didn’t show up in a yahoo search. I’d thought I was lucky in finding a site written by a math teacher that gave a very simplified description of the original 5×5 table version. He only discussed the encoding stage, and not the decoding part. I am sure, although I haven’t gone back to check, that he used the left-to-right, top-to-bottom row stripping pattern for writing off the codes to a string, and not Singh’s more official top-to-bottom, left-to-right column stripping method.

I really wish I’d looked more closely at the appendix in Singh’s book, because I would have saved myself a world of hurt that way. End Edit.

To be continued.

The Code Book – Cipher 7


I originally had planned to not write anything about the Code Book contest ciphers unless a few people asked for it, partly because I don’t really need to have yet another thing on my plate when I still haven’t finished writing up the Martin Gardner Colossal Book chapters, and partly because I don’t know if anyone really cares about breaking ciphers that are already documented and have been on the net for 17 years. But, I did want to at least try my hand at the first few, and with the help of the explanations of Almgren and the other Swedes that won the contest, I’ve been able to get the solutions for stages 1-6 pretty easily (1, 2 and 4 I did completely on my own). Keep in mind, though, that the Swedish team didn’t spell out all of the answers in full detail on any of the stages. They gave hints, and brief descriptions of the steps they followed, but they didn’t specify actual keys, and in the case of stage 7 they didn’t give the first 80% of the plaintext, which would have made my job that much easier. As I’ve mentioned before, I’m not that good at breaking ciphers, so I want all the help I can get.

And this time, there’s not much help… Stage 7 is an ADFGV-type cipher, which I’ve never worked with before. So, I’m going to write down my thoughts as I go through this, as much for begging for help from someone else, as it is for trying to ensure that I understand what it is I’m doing right or wrong. First off, the ADFGV was developed by Lt. Fritz Nebel and introduced in 1918 for use by the German army on the western front in WWI. It’s a combined transposition and substitution cipher designed to reduce errors when the messages are sent in Morse code via telegraph.

Initially, the idea was to make a 5×5 table with 25 letters of the alphabet randomly assigned, with the letters ADFGV at the top and left side of the table. These letters were chosen because they have very different Morse code symbols and are hard to mistake for each other. Because the table has 25 letters, I and J can be doubled up, or you can drop Q or X. Now, take your plaintext message:

Ex.) I EAT CODES

Strip out the spaces,

IEATCODES

And substitute each plaintext letter with the row and column characters from the table (ie. – “I” = “DG”, “E” = “AV”):

DG AV AA GG AF FG AG AV GF

If we sent this message right now, it could be broken through a frequency analysis (not likely, since it’s such a short message, but anything longer would be a cinch.)

AV – appears twice,
everything else, once each. But, most of the less common letters in the English alphabet don’t appear at all.  So, that’s something.

A very simple way of making the code harder is to make the table bigger. So, we go to an ADFGVX table, which also makes this thing more useful for sending messages, too, since we’ve added digits to the cipher.

Ok, I’m now about to get to the contest cipher. If you look at that page, you’ll see a REALLY LONG cipher text that starts with:

MCCMMCTRUOUUUREPUCCTC

Now, this obviously is missing all the parts that include ADFGV and or X. However, if you do a frequency count, you’ll see that what we have instead are the letters CEMOPRTU, which if you put them at the top and left side of the table will give you an 8×8 table, for 64 possible characters. The Swedish team assumed (correctly) that the table would include all 26 uppercase letters, all 26 lower case letters, the digits 0-9, a space and an end-of-sentence marker (which we can treat as “. “). So, this is going to make for a heavier-duty substitution code, because there are now two possible values for any given letter (i.e. – “a” and “A”), but if uppercase isn’t used that extensively, nor are the digits, the lowercase letters will still fall to frequency analysis.

Ok, frequency analysis for beginners.
All written phonetic languages are made up of letters or characters that can be called the “alphabet” for that language. But, no matter which language you look at, certain letters tend to be used more often than others in the words that are a part of that language. For English, the most common letter is “e” (just count all the letters in this blog entry and check for yourself), followed by “t” “a” “o” “i” and “n”, giving us the famed Lovecraft detective, Etaoin shrdlu.

There are charts on the net for most major languages. You can find one for English here at the Cornell.edu site.

Pretending, just for a moment, that we haven’t done the transposition step yet for the CEMOPRTU cipher, then running a frequency count on the Contest cipher is going to result in something of a surprise – at 7000+ characters, the letter percentages in the text don’t match the Cornell English chart (ignoring also that THE most common character is probably going to be the space character between words). In fact, the plaintext is in German, where the most common letters are ENIRST (you can see a chart here at the sttmedia.com site).

Anyway, here’s the important take away for step 1:
1) Make a 8×8 table with CEMOPRTU written along the top and the left side.
2) Fill the table randomly with all of the upper and lower case letters, the digits 0-9 and the space and period.
3) Write down your plaintext message.
4) Encipher your message with the row and column pairs for each character in your plain text.

Note that both you and the recipient of your message need to have copies of the table. And that your recipient deciphers the message simply by taking 2 characters at a time, with the left character being the row and the right being the column, and recording the matching character from the table to reconstruct the plaintext.

Using the simplified ADFGV table above,
DG = I
AV = E
AA = A
GG = T
AF = C
FG = O
AG = D
AV = E
GF = S

To be continued.

The Code Book


(All rights belong to their owners. Images used here for review purposes only.)

The Code Book, by Simon Singh, 1999, Anchor Press
A couple of weeks ago, I was at the Junkudo bookstore near my apartment, and I had a couple hours to kill. Junkudo has a small section of English import books, with 2-3 shelves of science and math books. Apparently, quite a few of these have been in inventory for a long time, and the store has them marked 20% off to get rid of them. I wanted something science-y to read, and it was kind of a toss-up between Fermat’s Last Theorem, and The Code Book, both written by Simon Singh. I’ve got something of a soft spot for codes, and the one had a section on the Beale Cipher, so I picked The Code Book. Unfortunately, the original cover price was $20 USD, and even with the discount came out well over the $11 I could have gotten it for from  amazon.com. But, I would have gotten killed on the shipping costs. Besides, I wanted it right then. Such is the price of convenience.

The Code Book originally came out in 1999, and at the back were 10 ciphers that made up the Cipher Contest. My edition of the book was reprinted in 2000, after the contest closed, with the $15,000 USD prize being claimed by 5 Swedes, three of whom had been working on their PhD degrees at the Royal Institute of Technology, Stockholm, at the time. The reprint stated that the contest had ended, but the ciphers were retained for anyone that wanted to work on them themselves. More about this later.

The book is a mix of history and explanation of how the different types of codes work, with the narrative sections being used to give context to the development of increasingly sophisticated ciphers, and the associated methods for breaking them. The story starts out with the trial of Mary, Queen of Scots (1542-1587), on charges of conspiring to assassinate her cousin, Queen Elizabeth. At the heart of her defense was her claim that she’d been completely unaware of what her supporters were planning while she was isolated in prison. Unfortunately for her, the messages between Mary and her followers had all been copied by Walsingham, Elizabeth’s chief spymaster, who had his own cryptographer on staff, Thomas Phelippes. Not only did they know exactly what Mary was discussing with her followers, they forged a letter in her handwriting and in the same code, designed to rat out both Mary and the co-conspirators by name. The rest is, literally, history. The cipher Mary used was a simple letter substitution that could be easily cracked through frequency analysis. From here, the book goes through the invention of the telegraph, telephone, radio and the computer, with stories from ancient Rome, the Renaissance, both World Wars, and up through the creation of the RSA cipher and into speculation about the possible development and use of quantum computers, with a brief foray into the Beale cipher.

The Code Book is a good attempt to introduce readers to cryptography and cryptanalysis, while also providing stories for people that don’t care about the math. It’s not a perfect balance, because occasionally the tales about people like Alan Turing are just superficial snapshots, while the descriptions of how some of the ciphers work are needlessly over-explained. I felt the sections on RSA encryption and quantum computing were especially repetitious and wordy. On the other hand, I can also understand that some readers may need the systems spelled out at that level of detail. Regardless, I do like the book, and I recommend it to anyone interested in cipher history, as well as wanting to learn what the basic types of cipher approaches are.

Quickly, codes are word substitutions, such as when you say “Big Bear” rather than “Oracle Corp.”, and “Little Bear” for Larry Ellison. Ciphers are those systems where you make the text of a message harder to read, such as scrambling the letters around within the text (transposition), or changing A for P, X for E (substitution). Ciphers can be as simple as the Captain Midnight decoder ring (i.e., the Caesar shift cipher), or as convoluted as the Enigma machine and public key encryption. One of the things that I liked about learning how public key encryption works is in how it uses prime numbers, and how that ties to my blog entries on the Riemann zeta function. Also, it was amusing to me to see a mention of Martin Gardner and his Mathematical Recreation columns, where Martin had published the workings of the RSA system in his column, and the avalanche of requests for copies of the RSA paper from its authors that the article generated. (Everything is connected.)

Now, for the cipher contest. As mentioned above, there are 10 messages at the back of the book, ranging from easiest to hardest. Each one is based on one of the systems mentioned in the book, and all of them have already been cracked (if you want to cheat and look for the solutions on the net). The first one was very easy, and was just a random letter substitution. Funny enough, the copy of the message on Singh’s website was truncated, with 60% of the text missing. Even so, that was more than enough to work with. I tackled it initially with just paper and pencil, but when I realized that I didn’t have the entire text, I figured I might as well track the full message down from another website (I didn’t want to hand-type it), and then write a VBscript program to complete the rest of the print out for me. That was fun. The second message was much shorter, and was just a Captain Midnight decoder ring puzzle. I started that one on paper with pencil, too, but the simple substitutions just produced garbage when I tried plugging “the,” “and,” “are,” “was,” etc. into the most common 3-letter word. Then I hit on the idea of rewriting my VBscript program to brute force print out all 25 possible messages by “rotating the decoder ring.” That gave me 25 messages to wade through, but one specific message looked almost like real words. That’s when I realized that the plaintext was in Latin.

The problem is that I’m not particularly good at deciphering stuff, and I’m already busy with 20 other projects I want to finish. So, I don’t want to try tackling any of the other messages from scratch. On the other hand, I can write VBscript relatively well, and I do like seeing the deciphered messages coming out of my scripts (showing that I got the theory right). So, I went to the Swedish team’s website and read through their solutions. Then I wrote another script for decrypting message 4 – a Vigenere cipher. Even knowing what the solution was, I kept introducing bugs into my code, so it took a little over an hour to get it right. I followed that with message 3 (a random letter substitution, but with 6 letters representing “E” to throw off the frequency analysis; took 30 minutes for that script). Then #5 and #6. #5 was a simple book cipher (the count the words in a source text, and each number represents one letter from the text; 30 minutes) and #6 was a Playfair cipher. The Playfair message is pretty interesting, from my viewpoint, and I really did like seeing the plaintext come out at the end. This leaves #7 (ADFGVX) and #8 (Enigma). Then I’ll call it quits.

If there’s enough interest, I might consider writing a short blog series talking about the above ciphers, and how the VBscripts work.

One last comment: I went to the Amazon page for The Code Book, and I looked through the 1-star criticisms. There are only a few, which all complain that the Kindle version of the book is very low-grade and is missing all of the illustrations. So, I recommend that if you do buy the book, you get the paperback version.

Colossal Gardner, ch. 6


The next Gardner article in the section on plane geometry is on Piet Hein’s superellipse.

Piet Hein was a Danish mathematician, inventor, designer, author, and poet who lived from 1905 to 1996. He created the games Hex, Nimbi, Tower and the Soma Cube. For poetry, he mastered something he called a grook – a short rhyming aphorism. He wrote over 7000 grooks, which have been published in 20 volumes. You can find copies on amazon. I first encountered a book of grooks in the late 1970’s, early 80’s, and I liked them a lot.

Example from the wiki entry:
It may be observed, in a general way,
that life would be better, distinctly
If more of the people with nothing to say
were able to say it succinctly
– Piet Hein

In 1959, the architectural team tasked with rebuilding a congested section of old houses and narrow streets in the heart of Stockholm, Sweden, couldn’t figure out how to do the layout. They approached Piet Hein, who eventually came up with the idea of the superellipse – a curve that fits pleasantly into a rectangle without having the sharper deformations of an ellipse.

The general form of an ellipse is:
abs(x/a)^n + abs(y/b)^n = 1

As n approaches infinity, the shape becomes a rectangle. “a” and “b” are arbitrary constants that represent the semiaxes of the curve, and n is any positive real number.

Piet Hein settled on n = 2.5.

The superellipse has been used as the basis for furniture, park landscaping, and the patterns for Danish postage stamps. When made 3D out of solid brass, referred to as a superegg, it will easily sit on one end without falling over.


Superegg, from the wiki article.

Challenge of the week: Draw your own superellipse and stick it on stuff.

Gatorade Video


This is what science does when you’re busy looking under the couch for the remote.

Direct youtube link

Wednesday answer


Solutions to this week’s puzzle can be found in the wiki entry on rep-tiles.