Wiggles, part 38


Just a little ongoing story to give you something to play with until the next blog post.

LXU ZNNM DX HVGN FRXAZ UB IZ JVBVZ DX SZXA JUCD HXA WUZMVYNZDVE WXPNC (SIDCUZN) VZM RVQQXXZ MXFC (DVZUSI) VRN DX DHN QUEDURN HNRN. TXDH SIZMC XW QRNVDURNC VRN SZXAZ DX HVGN CHVBNCHIWDIZF VTIEIDINC, CUBNRZVDURVE BXANRC, V DNZMNZQL WXR HVZFIZF VRXUZM HUYVZ CXQINDL, VZM V EXGN WXR YNCCIZF AIDH BNXBEN. DHVD CIEGNRL YXGNYNZD TVQS IZ DHN TVR QXUEM HVGN TNNZ VZLDHIZF – V QVD AIDH V CEIQS QXVD, VZ VTZXRYVEEL VFIEN CYVEE QHIEM, XR V CHXRD WXP. I HVM YL MXUTDC VTXUD V DVZUSI, TNQVUCN DHXCN HVGN V RNBUDVDIXZ WXR TNIZF YXRN BEXMMIZF VZM TUESL. TUD, IW I AVC MNVEIZF AIDH CBIRID VZIYVEC, VZLDHIZF AVC BXCCITEN. XZ DHN XDHNR HVZM, I QXUEMZ’D IYVFIZN V QVD XR QHIEM QHICNEIZF DHN NZDRVZQN DX DHN QRNGIQN. VEYXCD IZCDIZQDIGNEL, I BUEENM XUD YL CUYVWX VZM TRXUFHD UB DHN BXSNYXZ FX VBB. ZXBN, ZX CIFZVE, VZM ZX EXXCN BISVQHUC. I BUD DHN BHXZN TVQS IZ YL BXQSND VZM QHNQSNM DHN VZIYVE DRVQSC VFVIZ. DHNL CNNYNM DX TN HNVMIZF CXUDH, DXAVRM DHN QEXCNCD NZM XW DHN CHXBBIZF MICDRIQD. I WVQNM IZDX DHN CEIFHD TRNNON VZM SNBD QRVAEIZF VEE WXURC. IW VZLDHIZF QVYN IZ HNRN VWDNR YN, ID’M TN BRNDDL XTGIXUC DX DHNY DHVD DHNIR CNQRND BEVQN HVM TNNZ IZGVMNM. DXX TVM WXR DHNY.

Thinking About Encryption, Part 41


Myszkowski Cipher

I’m starting to realize that there are a lot of transposition cipher algorithms out there. I was convinced that the majority of cipher types are substitutions, but I keep encountering transpositions. The current one is Myszkowski, first proposed by retired French Colonel Émile Victor Théodore Myszkowski in 1902. It’s a variation on the Incomplete Columnar Transposition, in which the keyword is selected specifically to have at least one repeated letter. The plaintext is written in rows under the key in order to create columns.

Example:
BALLOON
itisafo
ulwindt
blowsno
good

Rearrange the columns in ascending alphabetical order of the key letters, treating subsequent repeating letters as next in order.

ABLLNOO
tiisoaf
luwitnd
lbowosn
ogod

Now comes the slightly tricky bit – read off the text as rows for the duplicated letters, from top to bottom, and just as columns for the letters that aren’t duplicated. Group in fives out of convention.

tlloi ubgis wiowo dotoa fndsn

That is, we start with “A”, which gives us “TLLO”
Then “B”, for “IUBG”
Next, we get “LL”, for “IS” + “WI” + “OW” + “OD”
Followed by “N”: “OTO”
And finally, “OO”: “AF” + “ND” + “SN”
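The read-off rule above can be sketched in a few lines of Python. This is only an illustration of the steps as described, not the script I actually use; the function names `myszkowski_encrypt` and `in_fives` are made up for the example, and the last row is left unpadded.

```python
def myszkowski_encrypt(plaintext: str, keyword: str) -> str:
    """Encrypt with a Myszkowski transposition (no padding of the last row)."""
    text = "".join(ch for ch in plaintext.lower() if ch.isalpha())
    key = keyword.upper()
    width = len(key)
    out = []
    # Visit the distinct key letters in alphabetical order.
    for letter in sorted(set(key)):
        # All columns that sit under this key letter, left to right.
        cols = [i for i, k in enumerate(key) if k == letter]
        # Walk down the rows. With one column this reads straight down;
        # with several columns it reads each row's cells across.
        for row_start in range(0, len(text), width):
            for col in cols:
                idx = row_start + col
                if idx < len(text):  # skip blank cells in the short last row
                    out.append(text[idx])
    return "".join(out)


def in_fives(text: str) -> str:
    """Group text in blocks of five letters, by convention."""
    return " ".join(text[i:i + 5] for i in range(0, len(text), 5))


print(in_fives(myszkowski_encrypt("itisafoulwindtblowsnogood", "BALLOON")))
# tlloi ubgis wiowo dotoa fndsn
```

Treating a single column as a "group of one" means the same loop handles both the duplicated and the non-duplicated letters.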

Deciphering pretty much follows the pattern in reverse. It helps to take the ciphertext and write it out under the keyword to form each of the columns, then sort the columns in ascending alphabetical order, and write in the cipher text again following the rules for duplicated letters. Finally, return the columns to keyword order and read the text off in rows.

The real challenge is in developing a good brute-force method for creating potential keys that emulate the recurrence of letters in a word. In one older ACA computer column, the author states that the two approaches he knows of are to index a dictionary to create a database of existing words with duplicated letters, and to use a hill-climbing routine. Because the article itself was intended to demonstrate a hill-climbing program for solving Myszkowski ciphers, that’s what he spent his time describing.

Because I already have a dictionary built up on my computer for doing word searches for Aristocrats (newspaper crypto-quips), I figured that I’d try the database approach. I spent a week thinking about how I’d write this, so when I did finally sit down to start working, I had an idea of how to begin. But, I was hoping that I’d be able to create a key generator along the lines of my permutating counter (123, 132, 213, 231, 312, 321), which would go something like:

1112
1121
1211
2111
1122
1123
1212
1213
1221
1222
1223
etc.

The basic rule is that there has to be a base letter (1), and at least one letter higher than that, but with no gaps. Because 1112 is functionally the same as 1113, there’s no point in allowing keys with gaps between the individual values. I’d also thought that I could pre-generate this list and save it to a file because it’s pretty short (for 4-letter words). Then, I’d be able to cut the list in half by eliminating strings that are left-right reflections of each other. In the solver itself, I could just do key = reverse(key) to recreate the reflections.
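A rough sketch of that "no gaps" rule, assuming a plain filter over every digit combination rather than a clever counter. The name `gapless_patterns` is hypothetical, and the output order is whatever `itertools.product` gives, which differs from my hand-written list above.

```python
from itertools import product


def gapless_patterns(width: int):
    """Yield digit tuples that use 1..m with no gaps, where 2 <= m < width.

    m < width forces at least one repeated value, and m >= 2 means there is
    at least one value above the base value 1.
    """
    for candidate in product(range(1, width), repeat=width):
        top = max(candidate)
        if top >= 2 and set(candidate) == set(range(1, top + 1)):
            yield candidate


patterns = ["".join(map(str, p)) for p in gapless_patterns(4)]
print(len(patterns), patterns[:4])
# 50 ['1112', '1121', '1122', '1123']
```

By the same count there are 420 patterns at width 5, and by my arithmetic a little over half a million at width 8, so the list grows very quickly with width.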

Everything went well for generating 4-, 5-, 6- and even 7-letter key files. Unfortunately, when I got to 8, everything blew up on me. When the key count got up over 200,000, I killed the script and went back to the dictionary approach.

First, pick a width (8), and find all words that are that many letters long. Next, build a “skeleton pattern” for each word. That is, for “balloon”, we can use “2133554”. Store that to an array, and append the word (balloon) after the pattern. (i.e. – “2133554|balloon”). For any other words that have the same pattern, append them to the end (“1223|beet;boot;coot;foot”). And repeat this until finished.
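Here is a minimal sketch of the skeleton-pattern step, assuming the dictionary is just a list of lowercase words. `skeleton` and `build_pattern_lines` are illustrative names, not the ones in my converter.

```python
from collections import defaultdict


def skeleton(word: str) -> str:
    """Rank each letter by alphabetical order among the word's distinct letters."""
    word = word.lower()
    letters = sorted(set(word))
    rank = {ch: str(i + 1) for i, ch in enumerate(letters)}
    # Note: ranks above 9 would need a separator to stay unambiguous.
    return "".join(rank[ch] for ch in word)


def build_pattern_lines(words, width):
    """Return lines such as '1223|beet;boot;coot;foot' for one key width."""
    groups = defaultdict(list)
    for word in words:
        word = word.lower()
        # Keep only words of the target width with at least one repeated letter.
        if len(word) == width and len(set(word)) < width:
            groups[skeleton(word)].append(word)
    return [f"{pat}|{';'.join(ws)}" for pat, ws in sorted(groups.items())]


print(skeleton("balloon"))
# 2133554
print(build_pattern_lines(["beet", "boot", "coot", "foot", "salt"], 4))
# ['1223|beet;boot;coot;foot']
```

Words with no repeated letter (like "salt") are dropped, since they can't be Myszkowski keys.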

This approach does reduce the number of words the solver needs to check. And actually, this is a good way of learning a bit about English language statistics. The following shows the number of words for given widths that have repeating letters, and the number of unique patterns for those widths.

Width   Total words   Unique patterns
 6          5419           1809
 7         11661           7374
 8         19582          16959
 9         25815          24524
10         28707          27932
11         26790          26307
12         22214          21929

The solver, then, has three parts. Part one determines the width of the target keyword (either in a for-loop, or hardcoded), and pre-builds an array of the correct size to establish column lengths.

01 02 03 04
05 06 07 08
09 10 11 12
13 14 15 16
17 18

Part two loads the patterns file for the target width and then walks through each pattern one at a time. This part is important, because when we reorder the columns, we could get something like:

03 01 04 02
07 05 08 06
11 09 12 10
15 13 16 14
.. 17 .. 18

I ran into a problem because I wasn’t paying enough attention to those blanks in the middle when I generated the output string of encrypted positions (which I separated by semicolons):

03;07;11;15;;01;04;05;08;09;12;13;16;17;;02;06;10;14;18;

That really threw things off when I applied the third part, which is the “unraveller” function I apply to the ciphertext combined with the encrypted positions to try to generate the plaintext. The ACA CONs (constructions) generally include cribs, so I just check whether the unravelled output contains the crib. If it does, I print the key, the skeleton pattern and the output string and quit. I fixed the above problem by using string replacement (key = replace(key, “;;”, “;”)).
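One way to sidestep the blank-cell problem is to never emit the blanks in the first place, instead of cleaning up “;;” afterwards. The sketch below assumes a list of positions rather than my semicolon string, and the function names are invented for the example.

```python
def positions_for_pattern(pattern: str, length: int) -> list[int]:
    """Return 1-based plaintext positions in ciphertext order for a skeleton pattern."""
    width = len(pattern)
    order = []
    for rank in sorted(set(pattern)):
        cols = [i for i, r in enumerate(pattern) if r == rank]
        for row_start in range(0, length, width):
            for col in cols:
                pos = row_start + col + 1
                if pos <= length:  # blank cells are never emitted
                    order.append(pos)
    return order


def unravel(ciphertext: str, order: list[int]) -> str:
    """Put each ciphertext letter back at its plaintext position."""
    plain = [""] * len(ciphertext)
    for ch, pos in zip(ciphertext, order):
        plain[pos - 1] = ch
    return "".join(plain)


def try_patterns(ciphertext: str, patterns, crib: str):
    """Return (pattern, plaintext) for the first pattern whose output holds the crib."""
    for pattern in patterns:
        text = unravel(ciphertext, positions_for_pattern(pattern, len(ciphertext)))
        if crib in text:
            return pattern, text
    return None


print(positions_for_pattern("2312", 18))
# [3, 7, 11, 15, 1, 4, 5, 8, 9, 12, 13, 16, 17, 2, 6, 10, 14, 18]
print(try_patterns("tlloiubgiswiowodotoafndsn", ["1223344", "2133554"], "wind"))
# ('2133554', 'itisafoulwindtblowsnogood')
```

The pattern “2312” corresponds to the 4-column example above, where columns 1 and 4 share a key letter.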

I haven’t tried doing time comparisons with solving other transposition ciphers, but my Myszkowski solver does seem to be bogging down, even with the simplified pattern database files. It can take several minutes to go through 16,000-24,000 words before ending without finding a match. On the other hand, it does still find most solutions when I know the correct key length for the CON in advance (either it’s given as part of the CON, or I just cheat and look at the answers from past newsletters). Myszkowski Transpositions aren’t that popular, and appear maybe 2-3 times a year, so I’m not that compelled to do a lot of testing right now. After checking 6 CONS, the solver initially failed to find 5 solutions. The first failure was because the newsletter editor used the number zero (0) instead of the letter (O) in the online version of the ciphertext. Two failures were because of the blank column cell issue mentioned above, which I did eventually catch and correct. And the final two failures were because the authors used keywords that weren’t in the dictionary (RECHERCHE, and something that was a combination of 2 words). There’s not much I can do about the use of foreign words or key expressions, short of switching over to hill-climbing. And I’m not prepared to try that yet, because I’m still unclear on the concept. Maybe later.

Anyway, the solver works on the majority of the ciphers I’ve tried it on, and since it’s not all that popular a cipher type, that’s good enough for me right now. And I’m pretty happy with the fact that it only took maybe 3-4 hours total to write the dictionary-to-skeleton database converter, a stand-alone encrypter and the solver. Plus another 3-4 hours to test and debug the solver. Not too bad.

Summary:
1) Myszkowski is a transposition cipher based on Incomplete Columnar.
2) It uses keywords that have at least one repeating letter.
3) The plaintext is written in rows under the keyword, then the columns are rearranged in ascending alphabetic sequence, with the repeating letters arranged in order of appearance in the word.
4) The ciphertext is read off in columns, with repeating letters grouped as small rows.

abbcc
-----
12345
6789A
BCDEF

= 16B23 78CD4 59AEF

5) Deciphering is the reverse process, but it helps to prebuild the column lengths under the keyword first before populating the matrix with the ciphertext.
6) Cracking Myszkowski can either be done through a hill-climbing algorithm (fine for cases where the keyword is from a foreign language, or if it uses short phrases), or by simply using words from the dictionary.
7) The dictionary approach can be sped up by presorting for words with duplicated letters, turning them into a skeleton pattern and then storing them in separate files for each keyword length.
Ex: BAMBOO = 213244
8) Myszkowski can easily be broken for keys that use single English words, if the plaintext is expected to be in English. To harden it, either use foreign words or key phrases made up of more than one word, or run it through something else, like Rail Fence, or a Route Transposition.

Wiggles, part 37


Just a little ongoing story to give you something to play with until the next blog post.

YGPLYP, NRH JEYVH YBDAH NRH VDGVQHNH OTGHJ LYJ LTMHQ NRYG NRH VQHATVH, BCN GDN YJ NYOO. T LYJ JNTOO JNCVW RYATGF ND VQYLO DG IP RYGMJ YGM WGHHJ. MYIG, YOO DX NRH IDGHP T’M IYMH NDMYP LYJ FDTGF ND FD NDLYQMJ BCPTGF GHL VODNRHJ. NRH JEYVH QYG GDQNR-JDCNR CGMHQ NRH IYTG JRDEETGF YQVYMH LYOWLYP RTFR YBDAH IH. T NCQGHM DXX NRH XOYJROTFRN YFYTG, YGM LYTNHM ND YMUCJN ND NRH MYQWGHJJ. NRH YTQ LYJ ICJNP YGM VDDO, LTNR Y JOTFRN BQHHZH XODLTGF ND IP QTFRN. NRHQH LHQH GDTJHJ – JODFFTGFJ YGM JSCHYWJ, LRTVR VDCOM BH IYMH BP LRYNHAHQ LYJ XODLTGF TG NRH ETEHJ BHGHYNR IH, DQ ATBQYNTDGJ TG NRH VDGVQHNH ETEHJ. GDN AHQP ODCM, BCN GDNTVHYBOH TX PDC VDGVHGNQYNHM DG NRHI. GD OTFRN, NRDCFR. T NCQGHM NRH XOYJROTFRN BYVW DG YGM TGJEHVNHM IP “XODDQ” IDQH VODJHOP. NRHQH LHQH IYQWJ TG NRH JYGM, SCTNH Y XHL, YVNCYOOP. NRTJ LYJ OTWH Y ITVQD-NRQDCFRLYP, LTNR LRYN ODDWHM OTWH VYN EYL EQTGNJ, VRTOMQHG’J JRDHEQTGNJ, YGM MQYF OTGHJ. IYPBH NRH OYNNHQ LYJ IYMH BP JDIHDGH ECOOTGF Y BYF DQ JYVW BHRTGM NRHI. NRH IDJN QHVHGN IYQWJ, NRH DGHJ DG NDE DX NRH DNRHQJ, ODDWHM YGTIYO-OTWH. OTNNOH EYMJ, YGM JRYQE VOYLJ. GDN XQDI Y VYN, NRTJ QHITGMHM IH IDQH DX Y QYVVDDG. T MDG’N WGDL LRYN XDK EQTGNJ ODDW OTWH, BCN T VDCOM BHOTHAH HTNRHQ DGH.

Thinking About Encryption, Part 40


It’s been a long time since I’ve talked about Autokey ciphers, and I’ve only just recently actively solved one from the ACA Cm newsletter. The Nov.-Dec. 2018 issue of the Cm had two Autokey CONS, and I’d figured I’d try solving both of them just because I’d thought I could. However, it had been so long since I’d looked at it, that I’d forgotten that the VBScript I’d written was just for generating Autokey ciphers for the blog, and wasn’t actually a solver. So, I pretty much had to start over from scratch.

The thing is, the other scripts I had for solving Vigenere had everything I needed for loading the tabula recta tables for Vigenere, Variant and Beaufort (which can be used with Autokey), and for loading and prepping the cipher text message. That left adding the functions for printing out key strings and plaintext for specific pieces of ciphertext, and then automating the process for partially solving the CON when given the placement of the crib and the width of the primer. Overall, I think I spent no more than 2 days on everything.

Recall that with the Vigenere cipher, you’re using a lookup table that has an index key at the left, and is Caesar-shifted one character to the left per line.

A ABCDEFGHIJKLMNOPQRSTUVWXYZ
B BCDEFGHIJKLMNOPQRSTUVWXYZA
C CDEFGHIJKLMNOPQRSTUVWXYZAB
D DEFGHIJKLMNOPQRSTUVWXYZABC
E EFGHIJKLMNOPQRSTUVWXYZABCD
:
H HIJKLMNOPQRSTUVWXYZABCDEFG
:
:
Z ZABCDEFGHIJKLMNOPQRSTUVWXY

First, we need to pick a keyword, say, “HELPME”, and then write the plaintext under repetitions of that.

HELPMEHELPMEHELPMEHELPMEHEL
inthebeginningiwasaquietman

To create the ciphertext, we take the plaintext one letter at a time, and locate it in the top row. We take the matching key letter at the left of the table, and look for the letter at the intersecting row and column we’ve formed. For “i”, the key letter is “H”. Going down the “i” column and across the “H” row, we get “P”. For “n” and “E”, we get “R”. Etc.

Variant and Beaufort work the same way; they just use different letter arrangement tables.

The main weakness with all of these cipher types is that the key is periodic, and all of the letters encrypted by a specific letter are all part of the same Caesar-shifted alphabet. Meaning that if we can determine the key length, we can apply simple letter frequency matching to that alphabet to get the shift value for recovering the plaintext. The way around this is to make the key the full length of the plaintext message.

With Autokey, we do this by picking a “primer,” a word or phrase that starts our keystring, and then append the plaintext message to the primer, so that the text is encrypting itself. Autokey can use the Vigenere, Variant, or Beaufort tables. The advantage here is that it’s as easy to remember the primer as it is the keyword, but we don’t get something that’s as periodic.

Example:
Primer: machine
Plaintext: inthebeginningiwasaquietman
Keystring: MACHINEINTHEBEGINNINGIWASAQ

Using the Vigenere table, the ciphertext is (with the letters in groups of 5):
UNVOM OIOVG UMOKO ENFID AQATE AD
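As a quick sketch of the example above: with the Vigenere table, a lookup is the same as adding the letter values modulo 26, so Autokey encryption comes down to a few lines. The function name is just for illustration.

```python
A = ord("A")


def autokey_encrypt(plaintext: str, primer: str) -> str:
    """Vigenere-table Autokey: the keystring is the primer followed by the plaintext."""
    plain = [ch for ch in plaintext.upper() if ch.isalpha()]
    keystring = (list(primer.upper()) + plain)[:len(plain)]
    # A Vigenere table lookup is addition modulo 26.
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(plain, keystring)
    )


cipher = autokey_encrypt("inthebeginningiwasaquietman", "machine")
print(" ".join(cipher[i:i + 5] for i in range(0, len(cipher), 5)))
# UNVOM OIOVG UMOKO ENFID AQATE AD
```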

Now, to break an Autokey cipher, most of the articles I’ve read so far work on the assumption that the crib (the hint given to help solve the cipher) is relatively long compared to the primer. This means that the crib is going to overlap itself somewhat between the plaintext and the keystring. We can see this in the above example for the word “beginning.” The idea is to encrypt the crib with itself, shifting the word on the line below to the right one position each time, and then checking to see whether the encrypted crib text appears in the cipher.

beginning
.......beginning
.......OK.......

“OK” does appear in the cipher, once, at position 13. We’ve now placed the crib, and we know that “beginning” starts at position 6. If we put the keystring we reconstruct above the ciphertext, and the reconstructed plaintext below, we get:

............BEGINNING......
UNVOMOIOVGUMOKOENFIDAQATEAD
.....beginning.............
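The "encrypt the crib with itself" search can be automated along these lines. This is a sketch under the assumption of a Vigenere table; `place_by_overlap` is a made-up name, and very short overlaps (one letter) can easily produce false hits on longer messages.

```python
A = ord("A")


def enc(p: str, k: str) -> str:
    """One Vigenere table lookup: plaintext letter p under key letter k."""
    return chr((ord(p) - A + ord(k) - A) % 26 + A)


def place_by_overlap(ciphertext: str, crib: str):
    """Yield (primer_length, crib_start) pairs, with crib_start counted from 1."""
    crib = crib.upper()
    for shift in range(1, len(crib)):
        # The tail of the crib (as plaintext) sits under its own head (as key).
        overlap = "".join(enc(p, k) for p, k in zip(crib[shift:], crib))
        start = ciphertext.find(overlap)
        while start != -1:
            if start - shift >= 0:
                yield shift, start - shift + 1
            start = ciphertext.find(overlap, start + 1)


hits = list(place_by_overlap("UNVOMOIOVGUMOKOENFIDAQATEAD", "beginning"))
print(hits)
# [(7, 6)]
```

The single hit says the primer is 7 letters long and the crib starts at position 6, which matches the hand-worked placement.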

It’s just a matter of working backwards in the alphabet table to find the missing corresponding plain or key text. The more text we get, the more we have for completing the rest of the message.

In the articles I’ve read, the authors focus on reconstructing the plaintext, first working from the crib to the right. After they reach the right end of the text, they return to the crib, and work left to finish recovering the plaintext, and finally revealing the primer.

That’s all well and good, but what if the crib is short compared to the primer? That is, what if the crib had been “the”? Now, there’s no overlap between the plaintext and the keystring, and the above method doesn’t work.

I’d like to propose a more generic approach that can be used in both the “long” and “short” crib cases. We start out with the ciphertext below, and the crib “have”. And, I’m using the Vigenere table.

QUVDW ITPGZ LTOSW ZMEYR HFOJP ARYVJ K

The entire point of the crib is to give us a word from the plaintext that we can use to crack the puzzle (in the real world, we’d like to hope that we have a lot more source material to work from). This means that we KNOW the crib is in both the plaintext AND the keystring. We can just slide the crib along the ciphertext, and write out the corresponding plain and key values, and see what that gets us. For solving for the key, for the first letters, “Q” and “h”, I get “J”. Continuing the process for the first 12 slide positions:

Pos = 1, QUVD, Built from crib: JUAZ
Pos = 2, UVDW, Built from crib: NVIS
Pos = 3, VDWI, Built from crib: ODBE
Pos = 4, DWIT, Built from crib: WWNP
Pos = 5, WITP, Built from crib: PIYL
Pos = 6, ITPG, Built from crib: BTUC
Pos = 7, TPGZ, Built from crib: MPLV
Pos = 8, PGZL, Built from crib: IGEH
Pos = 9, GZLT, Built from crib: ZZQP
Pos = 10, ZLTO, Built from crib: SLYK
Pos = 11, LTOS, Built from crib: ETTO
Pos = 12, TOSW, Built from crib: MOXS
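The sliding step is easy to script. A minimal sketch, assuming the Vigenere table (so "solving for the key" is ciphertext minus crib, modulo 26); `slide_crib` is an illustrative name only.

```python
A = ord("A")


def slide_crib(ciphertext: str, crib: str) -> list[str]:
    """For each slide position, return ciphertext minus crib (Vigenere table)."""
    crib = crib.upper()
    results = []
    for start in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[start:start + len(crib)]
        built = "".join(
            chr((ord(c) - ord(p)) % 26 + A) for c, p in zip(window, crib)
        )
        results.append(built)
    return results


cipher = "QUVDWITPGZLTOSWZMEYRHFOJPARYVJK"
for pos, built in enumerate(slide_crib(cipher, "have")[:12], start=1):
    print(f"Pos = {pos}, {cipher[pos - 1:pos + 3]}, Built from crib: {built}")
```

Scanning the output by eye for English-looking fragments is still the manual part.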

Now, one weird artifact that appears in Vigenere that’s not in Variant or Beaufort, is that solving for the keystring actually gives you the same result as for solving for the plaintext. Case in point, the below list is for the plaintext:

Pos = 1, QUVD, Built from crib: JUAZ
Pos = 2, UVDW, Built from crib: NVIS
Pos = 3, VDWI, Built from crib: ODBE
Pos = 4, DWIT, Built from crib: WWNP
Pos = 5, WITP, Built from crib: PIYL
Pos = 6, ITPG, Built from crib: BTUC
Pos = 7, TPGZ, Built from crib: MPLV
Pos = 8, PGZL, Built from crib: IGEH
Pos = 9, GZLT, Built from crib: ZZQP
Pos = 10, ZLTO, Built from crib: SLYK

Ignoring that, looking for text fragments that look English-like, we get “NVIS” at position #2, and “ETTO” (maybe for “get Tom”?) at position #11. If we build up what we have (keystring on top, cipher in the middle, and plaintext below), we have:

.NVIS.....HAVE
QUVDWITPGZLTOSWZMEYRHFOJPARYVJK
.have.....etto

If we make the assumption that ACA Autokey CONs use primers shorter than 11 letters, then because “NVIS” is at position 2, and “ETTO” is at position 11, we get a tentative primer length of 9. And here’s my point – instead of continuing to the end of the message and then guessing at words in the plaintext and keystring, and trying to fill them in before tackling the primer, why not attack the primer NOW?

Using an online Scrabble-like word finder for 9-letter words fitting the pattern “?NVIS????”, we get envisaged, envisages, envisions, invisible, invisibly and unvisited.

Testing “ENVIS”, the result is “MHAVE”. Testing “INVIS”, the result is “IHAVE”.

Boss, I think we have a winner. Punching in a primer of “INVISIBLE” causes the entire cipher to crumble.

INVISIBLEIHAVEASECRETTOWHICHNOM
ihaveasecrettowhichnomanisprivy
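Testing candidate primers is just Autokey decryption, where each recovered plaintext letter is appended to the key as we go. A short sketch, again assuming the Vigenere table, with a hypothetical function name and only a few of the candidates:

```python
A = ord("A")


def autokey_decrypt(ciphertext: str, primer: str) -> str:
    """Vigenere-table Autokey: each recovered plaintext letter extends the key."""
    key = list(primer.upper())
    plain = []
    for i, c in enumerate(ciphertext.upper()):
        p = chr((ord(c) - ord(key[i])) % 26 + A)
        plain.append(p)
        key.append(p)  # the plaintext feeds the keystring
    return "".join(plain).lower()


cipher = "QUVDWITPGZLTOSWZMEYRHFOJPARYVJK"
for primer in ["envisaged", "envisions", "invisible", "invisibly"]:
    print(primer, autokey_decrypt(cipher, primer))
```

A wrong primer garbles everything after the first few letters, so the right one tends to stand out immediately.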

In summary, Autokey has a major flaw, in that if you can place a crib in both the keystring and the plaintext, you’re given the length of the primer. If necessary, you can work towards the left to uncover bits of the primer, and possibly use that with an online word finder to obtain potential primers that you can then plug into an Autokey solver. If the list is short enough, you’ll crack the cipher faster than if you follow the conventional methods.

Wiggles, part 36


Just a little ongoing story to give you something to play with until the next blog post.

D QFPFPHFQFR EFUQDNM JQSP PG PSLEFQ ZSPF LDPF HUXT LEUL DN LEF MQFUL EUNZEDN FUQLEBWUTF DN 1995, SNF SJ LEF PSQF ZESXTDNM RDZXSOFQDFZ KUZ LEUL LEF XSPAUNDFZ LEUL AWL DN LEF ZWHKUGZ EURN’L JSYYSKFR LEF AQSXFRWQFZ LEFG’R XYUDPFR LEFG EUR, DNZLFUR SJ HSQDNM LEF QUDY LWNNFYZ UNR AWLLDNM DN QFDNJSQXDNM, LEFG’R VWZL RWM YSNM LQFNXEFZ, AWL DN LEF AYULJSQPZ UNR LQUXTZ, UNR ASWQFR RDQL SN LSA SJ FOFQGLEDNM. PUGHF LEUL’Z KEUL EUR EUAAFNFR EFQF. LEF XSNZLQWXLDSN XQFKZ EUR RWM LEFDQ LQFNXE, YSKFQFR DN YUQMF XSNXQFLF ADAFZ, UNR JDYYFR DN LEF ESYF KDLE ZUNR. UJLFQKUQR, ZSPFLEDNM EUR XSPF HUXT LEQSWME UNR ZXSSAFR SWL U LWNNFY DN LEF AUXTFR ZUNR QWNNDNM UYSNM LEF LSA SJ LEF ADAFZ. D AWYYFR PGZFYJ WA DNLS LEF ASXTFL, UNR XQUKYFR JSQKUQR LS LEF XFNLFQ SJ LEF ZFKFQ QSSJ. D EUR LS XSQQFXL PGZFYJ – D ZLDYY RDRN’L TNSK KEUL LEFZF ADAFZ KFQF JSQ, HFXUWZF ZFKUMF YDNFZ KSWYR EUOF LS HF UNSLEFQ PFLFQ SQ LKS RSKN, UNR QUDNKULFQ QWNSJJ KSWYRN’L EUOF LS HF LEDZ RFFA. DL KUZ VWZL SRR.

Baconian Tarantulas


Just a little experiment I tried with the Baconian cipher. The four main rows in the center use 4 separate keys, while the group at the pentagram corners use one key. Kind of a tip of the hat to Voltaire.

If you can figure it out, leave a comment.

Thinking About Encryption, Part 39


Amsco is another one of those ciphers used by the ACA that doesn’t have much written on it on the net (nothing on Wikipedia), but for which several people have written online solvers. So, I don’t know where the name came from, or if it’s been used much historically or in fiction. [Edit: Thanks, limax! AMSCO is apparently named after its creator, A.M. Scott, and it appeared in the Elementary Cryptography book written by Piccola.] The cipher itself is a modification of Incomplete Columnar Transposition.

Recall that ICT consists of picking a numeric key (sequential digits, no duplications; e.g. – 624135), and writing the plaintext under the key in rows. Rearrange the columns to be in ascending key digit order, and read off the columns from top to bottom, left to right.

3142
----
this
isat
est


1234
----
hsti
stia
s.et


hsssttieiat
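For reference, ICT with a short digit key fits in a few lines of Python. This sketch assumes a key of single digits (so at most 9 columns) and uses a made-up function name.

```python
def ict_encrypt(plaintext: str, key: str) -> str:
    """Incomplete Columnar Transposition with a digit-string key such as '3142'."""
    text = "".join(ch for ch in plaintext.lower() if ch.isalpha())
    width = len(key)
    # Column indices ordered by their key digit: '3142' -> [1, 3, 0, 2].
    col_order = sorted(range(width), key=lambda c: key[c])
    # text[c::width] is column c read from top to bottom.
    return "".join(text[c::width] for c in col_order)


print(ict_encrypt("thisisatest", "3142"))
```

The slice `text[c::width]` does the "write in rows, read in columns" part without building a table at all.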

Where Amsco differs from ICT is that it breaks the plaintext up into alternating monomes (single letters) and dinomes (2-letter pairs). The first “cell” of the “table” we’re creating can contain either a monome or a dinome. After that, they alternate both in rows and in columns.

3..1..4..2
-----------
t..hi.s..is
at.e..st.m
e..ss.a..ge
he.l..pm.e
n..ow

Reading off the columns in 1-4 order in groups of 5 (which is a tradition, only):

hiess lowis mgeet atehe nssta pm

Note that it’s not necessary to pad out the end of the plaintext to form a complete rectangle, or to get the final dinome to be a 2-letter pair.

The above example started with the monome. If we start with the dinome, we get:

3..1..4..2
-----------
th.i..si.s
a..te.s..tm
es.s..ag.e
h..el.p..me
no.w

itese lwstm emeth aeshn osisa gp
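Both examples above can be reproduced with a small sketch. I'm assuming a single-digit key and that the cell size depends on row plus column (so each row starts with the opposite of the row above); `amsco_cells` and `amsco_encrypt` are names invented for this example.

```python
def amsco_cells(length: int, width: int, start_size: int) -> list[tuple[int, int, int]]:
    """Return (column, start, size) for each cell, filled row by row.

    start_size is 1 to begin with a monome or 2 to begin with a dinome.
    """
    cells, pos, row = [], 0, 0
    while pos < length:
        for col in range(width):
            if pos >= length:
                break
            # Sizes alternate along a row, and each row starts with the
            # opposite size from the row above it.
            size = start_size if (row + col) % 2 == 0 else 3 - start_size
            size = min(size, length - pos)  # the final dinome may come up short
            cells.append((col, pos, size))
            pos += size
        row += 1
    return cells


def amsco_encrypt(plaintext: str, key: str, start_size: int = 1) -> str:
    """Read the cells off column by column, in ascending key digit order."""
    text = "".join(ch for ch in plaintext.lower() if ch.isalpha())
    cells = amsco_cells(len(text), len(key), start_size)
    out = []
    for col in sorted(range(len(key)), key=lambda c: key[c]):
        out.extend(text[s:s + n] for c, s, n in cells if c == col)
    return "".join(out)


print(amsco_encrypt("thisisatestmessagehelpmenow", "3142", 1))
# hiesslowismgeetatehensstapm
print(amsco_encrypt("thisisatestmessagehelpmenow", "3142", 2))
# iteselwstmemethaeshnosisagp
```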

Deciphering the ciphertext is a bit more tricky. I think the simplest method is to count the length of the ciphertext, and then build up the table as if you were preparing the plaintext for encryption. However, instead of filling the table with the message, just enter 1 (for monome) and 2 (for dinome).

3..1..4..2
----------
1..2..1..2
2..1..2..1
1..2..1..2
2..1..2..1
1..2

This tells us that the first 8 letters are going to go into the 1’s column (2+1+2+1+2), the next 6 into the 2’s column, etc., as well as how many letters per cell. Just for the example, I’ll type the 1’s column. If the message isn’t padded, or if the final dinome comes out short, we can deal with that by checking the message length and applying the correct length for that last “cell” when we write it out.

3..1..4..2
----------
1..hi..1..2
2..e..2..1
1..ss..1..2
2..l..2..1
1..ow

hiess lowis mgeet atehe nssta pm

You can compare this to the first table to see that the two are going to turn out to be the same. The plaintext is then read off in rows.
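The "table of 1s and 2s" trick translates fairly directly into code. A sketch, assuming a single-digit key; the layout is rebuilt from the message length alone, then the ciphertext is dealt into the cells column by column. The names here are hypothetical.

```python
def amsco_layout(length: int, width: int, start_size: int) -> list[tuple[int, int]]:
    """Return (column, size) for each cell in row order, as if encrypting."""
    cells, pos, row = [], 0, 0
    while pos < length:
        for col in range(width):
            if pos >= length:
                break
            size = start_size if (row + col) % 2 == 0 else 3 - start_size
            size = min(size, length - pos)  # short final cell if unpadded
            cells.append((col, size))
            pos += size
        row += 1
    return cells


def amsco_decrypt(ciphertext: str, key: str, start_size: int = 1) -> str:
    """Fill the cells from the ciphertext in key order, then read off in rows."""
    cipher = "".join(ch for ch in ciphertext.lower() if ch.isalpha())
    cells = amsco_layout(len(cipher), len(key), start_size)
    plain = [""] * len(cells)
    taken = 0
    for col in sorted(range(len(key)), key=lambda c: key[c]):
        for i, (c, size) in enumerate(cells):
            if c == col:
                plain[i] = cipher[taken:taken + size]
                taken += size
    return "".join(plain)


print(amsco_decrypt("hiess lowis mgeet atehe nssta pm", "3142", 1))
# thisisatestmessagehelpmenow
```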

Breaking Amsco is going to be a lot like breaking ICT. In fact, I copied my ICT solver script, and just gutted the part that handles encrypting the column data. I prefer using a brute force software approach, in which I pick a starting key (e.g. – “1234”) and use that to encrypt the positions of the plaintext message:

1..2..3..4
----------
1..2..3..4
5..6..7..8
9..10.11.12
...

1,5,9,2,6,10,3,7,11,4,8,12

I then apply the ciphertext to the positions and “unravel” the message and check if it produces anything legible (by both counting how many common 3- and 4-letter words are formed, and checking whether the result contains the crib word, if any). If it’s the wrong key, I just increment it (1234, 1243, 1324, 1342, 1423, 1432, etc.) and try again until I hit on the right key.

However, for Amsco I need to include the monome/dinome widths as well. And, if I have the widths, then the position numbering in the table is going to be non-sequential.

.1.....2.....3.....4
-----------------------
.1(2)..3(1)..4(2)..6(1)
.7(1)..8(2).10(1).11(2)
13(2).15(1).16(2).18(1)

1(2), 7(1), 13(2), 3(1), 8(2), 15(1), 4(2), 10(1), 16(2), 6(1), 11(2), 18(1)

This looks incredibly clunky as written here, but what it means is that to recreate the plaintext, I’m going to count off the first character in the ciphertext, and write down characters #1 and #2. Then I’m going to count to the 7th character, and write down #7. I’ll follow this with #13 and #14, #3, #8 and #9, and so on. If the ciphertext is:

hiess lowis mgeet atehe nssta pm

I’ll get something like:

hioeeewi

Which is obviously wrong. But, that just means that I have the wrong key. I increment to 1243, try again, increment to 1324, try again, and eventually, when I hit 3142 I’ll have my plaintext out.
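The whole brute-force loop can be sketched as below. This is not my VBScript; it's a compact Python stand-in that uses `itertools.permutations` in place of the incrementing counter, tries both the monome and dinome starts, and collects every output containing the crib rather than stopping at the first. All names are made up for the example.

```python
from itertools import permutations


def amsco_decrypt(cipher: str, key: tuple, start_size: int) -> str:
    """Decrypt Amsco for a key given as a tuple of column numbers."""
    width, length = len(key), len(cipher)
    # Lay out the cells row by row as (column, size), as if encrypting.
    cells, pos, row = [], 0, 0
    while pos < length:
        for col in range(width):
            if pos >= length:
                break
            size = start_size if (row + col) % 2 == 0 else 3 - start_size
            size = min(size, length - pos)
            cells.append((col, size))
            pos += size
        row += 1
    # Deal the ciphertext into the cells, column by column in key order.
    plain, taken = [""] * len(cells), 0
    for col in sorted(range(width), key=lambda c: key[c]):
        for i, (c, size) in enumerate(cells):
            if c == col:
                plain[i] = cipher[taken:taken + size]
                taken += size
    return "".join(plain)


def amsco_crack(ciphertext: str, width: int, crib: str):
    """Try every key of the given width; return all outputs containing the crib."""
    cipher = "".join(ch for ch in ciphertext.lower() if ch.isalpha())
    hits = []
    for key in permutations(range(1, width + 1)):
        for start_size in (1, 2):  # 1 = start with monome, 2 = start with dinome
            text = amsco_decrypt(cipher, key, start_size)
            if crib in text:
                hits.append((key, start_size, text))
    return hits


for hit in amsco_crack("hiess lowis mgeet atehe nssta pm", 4, "message"):
    print(hit)
```

For widths of 8 or 9 the number of permutations gets large, so a pure-Python version like this would likely need some pruning to be practical.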

Now, a disclaimer. If the key period is long enough (7 or larger), it is possible that certain cribs could appear in otherwise garbled outputs when the key is close enough to being correct (say, at 5467321 and 5467123). Because of this, I’m not stopping my script when the output does generate something containing the crib. I just have to monitor the output and abort the script when I do get the right key.

Even with the extra processing for Amsco, the script runs fast. It can test every 7-digit key value in a minute or two, while it takes a couple hours on a 9-digit key. Currently, the script alternates testing “start with the Monome” with “start with the Dinome” for each key width. Of the ACA Amsco CONs I’ve tested the script with so far, most have had 7-digit keys, a couple have had 8, and only one had a 9-digit key. Unless I encounter a CON with a 10-digit key or greater, I’ll be happy with my script as it is. Also, “start with the Monome” is slightly more popular than “start with the Dinome” in the ACA CONs I’ve looked at.

Summary:
1) Amsco is a simple transposition cipher.
2) It is a variation on the Incomplete Columnar Transposition, and uses alternating 1-letter (monomes) and 2-letter pairs (dinomes) instead of just all single letters.
3) The first “cell” of the table can be a monome or a dinome, and then after that the rows and columns use alternating dinomes and monomes.
4) First, pick a numeric key and write it at the top of the table.
5) Then write the plaintext under the key in rows of alternating monomes and dinomes. The rows start with the opposite of the thing starting the row above it.
6) When the plaintext is fully written out, rearrange the columns in ascending key digit order, and read off the columns from left to right, top to bottom, to form the ciphertext.
7) Deciphering is a bit more tricky, and it helps to create the table as widths of 1 or 2 characters, and use that for filling in the table from the cipher message.
8) Cracking Amsco through brute force is similar to cracking ICT, but you need to encrypt the cell widths as well as the plaintext positions.
9) Because ACA CONs generally include a crib (a word that appears in the plaintext), it’s easy to test the results of the brute force approach. Print the output if it contains the crib word.
10) Most of the ACA Amsco CONs that I’ve looked at have had a key width of 7, but a couple have had widths of 6 and 9, and a few more have had widths of 8.
11) A VBScript running on a moderately fast PC can break an ACA Amsco CON with a width of 8 in under 10 minutes; and one with a width of 6 in a few seconds.

Wiggles, part 35


Just a little ongoing story to give you something to play with until the next blog post.

ZAPZFG, CBJM MU VYHLZ, D PJOLX ZVQDLU GZLL GSVG GSZBZ EVQ V SJLZ VG GSZ PZDLDYH JC GSZ PBZRDPZ DY CBJYG JC GSZ PJYPBZGZ. QJMZJYZ JB QJMZGSDYH SVX PSDQZLZX JB SVMMZBZX VY VPPZQQ FJDYG CBJM GSZ GJF QDXZ JC GSZ QZEZB. D EJBTZX VG QGVYXDYH WVPT OF, FOQSDYH MU SZVX GSBJOHS GSZ SJLZ VQ D QGBVDHSGZYZX MU LZHQ. D EVQ LOPTU – GSZ SJLZ EVQ WDHHZB GSVY GSZ VPPZQQ FJDYG VG GSZ PLOW, QJ D XDXY’G YZZX GJ XDQLJPVGZ MU QSJOLXZB VHVDY. D XOPTZX XJEY GJ BVDQZ MU VBM EDGS GSZ CLVQSLDHSG, VYX QGJJX WVPT OF. GSZBZ EVQ V CVDBLU LVBHZ FJPTZG VWJRZ GSZ QZEZB LDYZ, WDH ZYJOHS CJB MZ GJ FOLL MUQZLC OF DYGJ. GSZ ZXHZQ JC GSZ PBZRDPZ EVLLQ EZBZ EJBY XJEY V WDG, VYX QLDHSGLU QLDPTZB GSVY ZLQZESZBZ, DMFLUDYH GSVG ESVGZRZB OQZX GSDQ GOYYZL EJOLX PSDMYZU OF VYX XJEY GSZ EVLLQ GJ HZG CBJM GSZ GJF JC GSZ QZEZB LDYZ GJ GSZ WJGGJM JC GSZ PBZRDPZ EDGSJOG OQDYH V LVXXZB JB BJFZ. GSZ FJPTZG EVQ QVYX-LDYZX, EDGS V QPVGGZBDYH JC QVYX JY GSZ GJF JC GSZ PJYPBZGZ.

Gakken Otona no Kagaku kit, 180125


Well, after a full year since the last official announcement on the Gakken page, we finally get a new kit, and it turns out to be a reissue of the Pinhole Planetarium. Released under the new Best Selection line (numbered #1), the price is 3,000 yen, but the accompanying booklet is a measly 16 pages. I’m very, very disappointed.

I’m really hoping that this does not become a trend, with “new” kits coming out once a year as recycles. Sigh.

Thinking About Encryption, Part 38


Ragbaby is a weird name for a cipher. Not sure where it came from – there doesn’t seem to be a wiki page for it, and most of the hits on Yahoo are for solver pages (I like my solver better). In the ACA implementation, Ragbaby is a simple substitution cipher that uses a single keyed alphabet, with the position of each letter in the plaintext determining how far to shift within that alphabet to get the corresponding ciphertext letter.

Generally, Ragbaby uses a 24-letter alphabet, with I doubling up with J, and W doubling up with X. However, you can use 25- and 26-letter alphabets as well without impacting the algorithm.

The first step is to pick a keyword, and remove any subsequent duplicated letters, then fill in the remaining letters in order. If the keyword is “TEACHER”, the alphabet becomes:

TEACHRBDFGIKLMNOPQSUVWYZ (note that J and X are removed)
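Building the keyed alphabet can be sketched like this. The `merged` parameter is my own invention for the example: by default it folds J into I and X into W for the 24-letter version, and passing an empty tuple gives the full 26 letters.

```python
def keyed_alphabet(keyword: str, merged=(("J", "I"), ("X", "W"))) -> str:
    """Keyword letters first (duplicates dropped), then the unused letters in order."""
    word = keyword.upper()
    base = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    for gone, kept in merged:
        word = word.replace(gone, kept)  # e.g. a J in the keyword becomes I
        base = base.replace(gone, "")
    seen = []
    for ch in word + base:
        if ch in base and ch not in seen:
            seen.append(ch)
    return "".join(seen)


print(keyed_alphabet("TEACHER"))
# TEACHRBDFGIKLMNOPQSUVWYZ
```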

Second, we write out the plaintext, keeping the word breaks and punctuation. Over the text, we write the position numbers, starting with the word number at the beginning of each word. Hyphenated words are counted as one word.

................................11 1111 111 1 111 1111 111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
in the world of fantasy, the fantasy world is the real one.

Now, the purpose of the position numbers here is to provide offsets from the plaintext letter within the keyed alphabet, to get the corresponding cipher letters. With “i” in the first word, we count one letter to the right of “i” in the keyed alphabet to get “K”. For “n”, two letters takes us to “P”. In the second word, two letters from “t” gets “A”, three from “h” gets “D”, and four letters from “e” gets “R”. If we reach the end of the alphabet, we just wrap around again to the left.

KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO.

Deciphering the text just works in reverse. Write out the keyed alphabet. Write out the ciphertext and place the position numbers above it, restarting the numbering at the beginning of each word. Then, take the first letter of the ciphertext (“K”), and count the position value (1) letters to the left to get “i”. Rinse and repeat.

................................11 1111 111 1 111 1111 111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO.
in the world of fantasy, the fantasy world is the real one.
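Since deciphering is just enciphering with the shift reversed, one routine with a direction flag covers both. A minimal sketch, assuming words are separated by spaces and that only letters advance the position count; `ragbaby` and `MERGE` are illustrative names.

```python
MERGE = {"J": "I", "X": "W"}  # 24-letter alphabet: J doubles with I, X with W


def ragbaby(text: str, alphabet: str, direction: int = 1) -> str:
    """Shift letters through the keyed alphabet; direction=1 encrypts, -1 decrypts."""
    words_out = []
    for word_no, word in enumerate(text.split(), start=1):
        offset = word_no  # numbering starts at the word number
        chars = []
        for ch in word:
            up = ch.upper()
            if up not in alphabet:
                up = MERGE.get(up, up)
            if up in alphabet:
                idx = (alphabet.index(up) + direction * offset) % len(alphabet)
                chars.append(alphabet[idx])
                offset += 1  # only letters advance the count
            else:
                chars.append(ch)  # punctuation and hyphens pass through
        words_out.append("".join(chars))
    return " ".join(words_out)


ALPHA = "TEACHRBDFGIKLMNOPQSUVWYZ"
secret = ragbaby("in the world of fantasy, the fantasy world is the real one.", ALPHA)
print(secret)
# KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO.
print(ragbaby(secret, ALPHA, direction=-1).lower())
# in the world of fantasy, the fantasy world is the real one.
```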

Breaking Ragbaby is surprisingly easy. First, there’s a fundamental weakness, in that the letter at position 24 encrypts to itself. Second, all of the cipher letters are at relative offsets from the corresponding plaintext letters, even though numbering increments at the beginning of each word. If we can place a “crib” (a word we’re told is in the plaintext), we can start nailing down the specific letters in the main alphabet.

Notice that we’ve kept the word spacings, so if we’re given a crib, the first step is to try counting its length and identifying which ciphertext words are the same length. Simultaneously, we can eliminate some potential positions if we get 1-to-1 correspondences between the letters (this will only happen at position 24; otherwise, “H” = “h” will never occur normally).
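
That crib-placement test is easy to automate. The sketch below is my own illustration (the name `crib_slots` is hypothetical): it keeps every ciphertext word with the same length as the crib, and drops a word if any crib letter would map to itself at a position that isn't a multiple of 24. It assumes every whitespace-separated chunk of the ciphertext contains at least one letter.

```python
import re

N = 24  # alphabet length; only a shift that is a multiple of N maps a letter to itself

def crib_slots(ciphertext, crib):
    """Return (word_number, cipher_word) pairs where the crib could sit."""
    hits = []
    # strip punctuation so only the letters of each word are compared
    words = [re.sub(r"[^A-Z]", "", w) for w in ciphertext.upper().split()]
    for word_no, word in enumerate(words, start=1):
        if len(word) != len(crib):
            continue
        # position of letter i in this word is word_no + i (i counted from 0)
        clash = any(
            c == p.upper() and (word_no + i) % N != 0
            for i, (p, c) in enumerate(zip(crib, word))
        )
        if not clash:
            hits.append((word_no, word))
    return hits

CT = "KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO."
print(crib_slots(CT, "fantasy"))
print(crib_slots(CT, "real"))
```

For this message, length alone does most of the work. The self-mapping rule only starts eliminating candidates when a crib letter happens to line up with the same cipher letter.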

Say the crib is “real” (to start with, just to show the concept). The only 4-letter ciphertext word is “PMOA”. Now, notice that we have “a” in “real” and “PMOA”. This is a good thing, so let’s claim that “a” is the first letter in our relative alphabet.

..........11111111112222
012345678901234567890123
a------------o----------

Above that “a” we have “O”, and a shift of 13. This is a “plaintext to ciphertext” conversion, so let’s count 13 letters to the right, and put the “o” there. Simultaneously, “A” decrypts to “l”, 14 letters to the left of “a.”

..........11111111112222
012345678901234567890123
a---------l--o----------

Now, one of the secrets in the ACA keying of alphabets like this is that if you’ve placed the key correctly, letters that are not part of the keyword will be in sequential order. Just by pure coincidence, there are two blanks between “l” and “o” in the keyed alphabet, and that allows for “lmno”. Let’s use that to decrypt part of the cipher message.

.....................11.........1111...111..1.111.1111.111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO
.................................a.............he..eal...e


..........11111111112222
012345678901234567890123
a.h.......lmno.........e

Note that in placing “m” to position 11 in the alphabet, and “n” to position 12, I was also able to associate “e” to “M” for “real” and “PMOA”. I then guessed that “IOM” might be “the”, so I counted off 11 letters to the left of “o” to place “h” at position 2 in the alphabet.

This is about as far as I can go with what I have right now. It would have been better to use a longer crib from the start, but that would have made things way too easy for this example. So, let’s start over and say the crib is “fantasy,” and we’re not told that “fantasy” appears twice in the plaintext.

We’ve got two factors working for us. First, both “MFWFKHG” and “OIZIMBK” have repeating letters (in fact, “F” and “I” appear in the same locations in both words, which is a big hint). And, “a” repeats in “fantasy”. Picking “a” as our initial letter in the relative alphabet will make things easier in the long run (I hope).

a = 0. a = F shifted 6. a = K shifted 9.

.....................11.........1111...111..1 111 1111 111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO
.................a..a


..........11111111112222
012345678901234567890123
a.....f..k

M = f shifted 5. F = t shifted 8.

.....................11.........1111...111..1 111 1111 111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO
...t..........f.fa.ta............a


..........11111111112222
012345678901234567890123
a.....f..k.m..........t
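
The bookkeeping in the last two steps can be sketched as a small propagation loop. This is a rough illustration with names of my own choosing (`place_crib`, `render`). It pins one letter to slot 0, then keeps applying “cipher = plain shifted right” and “plain = cipher shifted left” until nothing new can be placed. Plaintext and cipher letters share one table, since both are slots in the same alphabet.

```python
N = 24  # alphabet length used in the example

def place_crib(pairs, anchor):
    """pairs: (plain, cipher, shift) triples; anchor: the letter pinned to slot 0."""
    slots = {anchor: 0}
    changed = True
    while changed:
        changed = False
        for plain, cipher, shift in pairs:
            cipher = cipher.lower()
            if plain in slots and cipher not in slots:
                slots[cipher] = (slots[plain] + shift) % N   # shift right
                changed = True
            elif cipher in slots and plain not in slots:
                slots[plain] = (slots[cipher] - shift) % N   # shift left
                changed = True
            elif plain in slots and cipher in slots:
                if (slots[plain] + shift) % N != slots[cipher]:
                    raise ValueError("crib placement is inconsistent")
    return slots

def render(slots):
    """Draw the relative alphabet with '.' for the slots still unknown."""
    row = ["."] * N
    for letter, idx in slots.items():
        row[idx] = letter
    return "".join(row)

crib, word, word_no = "fantasy", "MFWFKHG", 5
pairs = [(p, c, word_no + i) for i, (p, c) in enumerate(zip(crib, word))]
print(render(place_crib(pairs, "a")))
```

The pairs for “n”, “s” and “y” stay unplaced here because nothing links them back to “a” yet.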

Looking at the alphabet, there are two spaces between “f” and “k”, but three candidate letters (ghi), so three letters can’t fit in two spaces and that gap has to wait. There is one space between “k” and “m” for one letter (l). We can plug in “l” at position 10, but it doesn’t give us much (just confirmation that “PMOA” ends in “l”, the way “real” does).

.....................11.........1111...111..1.111.1111.111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO
...t..........f.fa.ta............a...................l


..........11111111112222
012345678901234567890123
a.....f..klm..........t
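
The gap-counting guess can be sketched like this. It’s only a heuristic, and the name `gap_candidates` is mine: for each pair of neighboring placed letters, it checks whether the number of blanks between them exactly matches the number of still-unplaced letters that fall between them alphabetically. A match is a guess to test against the ciphertext, not a proof, and the sketch ignores gaps that wrap around the end of the alphabet.

```python
LETTERS = "abcdefghiklmnopqrstuvwyz"  # 24-letter alphabet, j and x dropped

def gap_candidates(rel):
    """rel: relative alphabet as a string, with '.' marking unknown slots.
    Returns (first_slot, letters) guesses where the blanks between two
    placed letters exactly fit the unplaced letters lying between them."""
    placed = [(i, ch) for i, ch in enumerate(rel) if ch != "."]
    guesses = []
    for (i, lo), (j, hi) in zip(placed, placed[1:]):
        # letters alphabetically between the two neighbors, not yet placed anywhere
        between = [c for c in LETTERS if lo < c < hi and c not in rel]
        if between and len(between) == j - i - 1:
            guesses.append((i + 1, "".join(between)))
    return guesses

print(gap_candidates("a.....f..k.m..........t."))
```

On the earlier “real” alphabet (a at 0, l at 10, o at 13) the same check also proposes “bcdefghik” for the nine blanks between “a” and “l”, which turns out to be wrong, so every guess still needs to be checked by decrypting with it.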

I’m now at the limits of my example. The ACA guidelines say that Ragbaby ciphers should be between 80 and 150 letters long, so my sentence doesn’t come close to qualifying. But, if we were to pursue this example further, we could make one of three assumptions. First, that the next letter after “m” will be “n”. That would let me place “n” = “W” shifted 7 in “fantasy.” If that doesn’t work out, it means that “n” is part of the keyword, and I’d have to try running it between “a” and “f.” Second, there are 10 spaces between “m” and “t”, and 12 letters from “n” through “z” to account for (with “x” dropped). “t” is part of the key, which leaves 11 letters for 10 spaces, so we just need to identify the one other letter in the key, and everything else will fit in sequential order. The last option is to see what would happen if we place the crib at “OIZIMBK,” because that’s still a possibility.

I’ll run with “OIZIMBK” first. K = y, shifted 13.

.....................11.........1111...111..1.111.1111.111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO
i..t...w......f.fanta........fanta.y..o.l.....t.....al


..........11111111112222
012345678901234567890123
a.....f.iklmno.....wyzt

Filling in a couple extra letters, just to speed this up a bit, we have two 3-letter words that both start with “t”. Plus, “IOM” is covered by our alphabet, so let’s see what happens with that.

.....................11.........1111...111..1.111.1111.111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO
i..t...w......f.fanta.....h..fanta.y..o.l.....the..eal...e


..........11111111112222
012345678901234567890123
a.h...f.iklmno.....wyzte

It’s obvious now that “g” can go between “f” and “i”. And that lets me check whether “BKG” is also “the”. I’ll add “H” = “s” shifted 10 while I’m at it. Which, incidentally, proves that “s” is at 16 in the alphabet, so that “u” and “v” can also go into positions 17 and 18.

.....................11.........1111...111..1.111.1111.111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO
i..t...wo.l..of.fantasy...he.fanta.y..o.l..is.the..eal...e


..........11111111112222
012345678901234567890123
a.h...fgiklmno..suvwyzte

“TUISN” and “RTOZU” are starting to shape up to look like the same word, so taking the leap, w = R shifted 8.

.....................11.........1111...111..1.111.1111.111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO
i..t.e.worl..of.fantasy...he.fanta.y.worl..is.the..eal...e


..........11111111112222
012345678901234567890123
a.hr..fgiklmno..suvwyzte

And now it’s all done but for the shouting. “worl_” is probably “world,” “pq” fits into the alphabet at positions 14 and 15, and that just leaves “b” and “c” to place. (“d” gets placed when I fill in “world”.)

.....................11.........1111...111..1.111.1111.111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO
in the world of fantasy, .he fanta.y world is the real ..e


..........11111111112222
012345678901234567890123
a.hr.dfgiklmnopqsuvwyzte

B = t shifted 6, which puts “b” at position 4; therefore “c” must be position 1 in the alphabet.

.....................11.........1111...111..1.111.1111.111
12 234 34567 45 5678901. 678 7890123 89012 90 012 1234 234
KP ADR TUISN UM MFWFKHG, BKG OIZIMBK RTOZU UH IOM PMOA CCO
in the world of fantasy, the fantasy world is the real one


..........11111111112222
012345678901234567890123
achrbdfgiklmnopqsuvwyzte

Finally, we know that the keyword starts at the beginning of the alphabet, and that by extension “z” has to be the last letter of the alphabet in this instance, and therefore the keyword is “TEACHR” (teacher).
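
Finding where the keyword starts can be sketched as a search over rotations. This is a heuristic of my own (the name `recover_keyword` is hypothetical), and it assumes the alphabet was keyed in the ACA style described above. It tries every rotation of the relative alphabet and keeps the one with the longest alphabetically increasing tail. Whatever sits in front of that tail is taken as the keyword with its repeated letters dropped.

```python
def recover_keyword(relative):
    """relative: the fully reconstructed relative alphabet (lowercase).
    Returns (keyword_without_repeats, rotated_alphabet)."""
    n = len(relative)
    best = None
    for start in range(n):
        rotated = relative[start:] + relative[:start]
        # walk back from the end while the letters keep increasing
        cut = n - 1
        while cut > 0 and rotated[cut - 1] < rotated[cut]:
            cut -= 1
        # a smaller cut means a longer sorted tail
        if best is None or cut < best[0]:
            best = (cut, rotated)
    cut, rotated = best
    return rotated[:cut], rotated

print(recover_keyword("achrbdfgiklmnopqsuvwyzte"))
```

If the last letters of a keyword happen to sort into the tail, this will come back with a keyword that is too short, so the result is best read as a starting guess.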

Summary:
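1) Ragbaby is a simple substitution cipher built on a single keyed alphabet. Because the shift changes with each letter’s position, the same plaintext letter can encrypt to several different cipher letters.
2) Letter positions in the plaintext are determined as the word number plus the letter’s offset within the word, counting the first letter as offset 0 (so the first letter of word 5 is at position 5).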
3) Word breaks and punctuation are maintained.
4) A keyword is used to generate the cipher alphabet (the keyword, no redundant letters, followed by the remaining letters in sequence).
5) Traditionally, Ragbaby uses a 24-letter alphabet, with I/J and W/X doubled up, but the algorithm works well with 25-, 26- and 36- character alphabets.
6) Cipher letters are obtained by shifting the plaintext letter to the right for the count represented by that letter’s position number. (For example, if the alphabet is DUSTINABCEFGHKLMOPQRVWYZ, then “t” at position 10 becomes “K”.)
7) Deciphering works the same way, but shifting is to the left instead of to the right.
8) Cracking a Ragbaby cipher starts with taking a crib (a hint word in the plaintext) and placing it within the cipher message based on word length, ruling out any placement where a letter would map to itself (i.e. “h” != “H”).
9) Ragbaby has a weakness, in that letters with position 24 do map to themselves.
10) Once the crib is placed, create the relative key alphabet starting with the letter that appears the most in both the crib and the selected cipher word.
11) The word/letter positions for the cipher message now become relative shifts from the cipher letter left to the plaintext, or from the plaintext letter right to the matching cipher letter.
12) To place a letter in the alphabet, both the cipher and plaintext letters need to be known. That is, if “a” is in the alphabet, you’re working on the word “XBL”, the currently reconstructed word is “?nd”, and “X” is at word/letter position 5, then by assuming that “?nd” = “and”, the relationship “a” = “X” shifted left 5 will hold true.
13) Ragbaby can be hardened by using short messages, longer alphabets, and a completely randomized key alphabet. Disguising the space as another character also helps.
14) Ragbaby is simple enough that it’s fun to break by hand.