As I’ve left the code standing so far, even the most simple Turing machine that actually does something useful is going to be assigned really big numbers, meaning running from 0 to the first working machine is going to take a long time. To get around that, Penrose suggests trimming some of the fat from the algorithms first.
Consider xn_times_2.txt:
00->00R
01->11R
10->00R
11->101L
100->00STOP
101->110R
110->101R
111->111R
As mentioned before, we strip out the numbers to the left of the arrow, and just rebuild that up in the Turing machine when we load the program. This is why the state numbers are in binary, and are paired with the 1s and 0s we read from the tape. It does mean, though, that we can’t have any missing state+bit read combinations. If there is a state+bit read combination not used in the algorithm, represent it with the dummy line “00R”. Also, strip out the arrows. And, don’t use commas to separate each line. Instead, we use “L”, “R” and “STOP” to mark the end of each statement.
00R
11R
00R
101L
00STOP
110R
101R
111R
Next, we can treat state “0” and read bit “0” (“00”) as “”. And treat state “0” and read bit “1” (“01”) as “1”. Also, the very first line, 00R is required in every algorithm to shift past leading 0 padding before reaching the start of data. We can cut the first line, and just automatically force it in the boot loader when we load the algorithm into the machine. That gives us:
11R
R
101L
STOP
110R
101R
111R
Notice that we tend to have more Rs than Ls, so in the binary coding, list R first.
0 = 0
1 = 10
R = 110
L = 1110
STOP = 11110
Substituting the expanded binary values;
1010110
110
100101110
11110
10100110
10010110
101010110
Finally, notice that R, L and STOP all end with the same last 3 digits. So, no matter what the last character is at the end of the last line, the last three digits will be 110. We can trim that off the end of the string, and append it again in the loader.
1010110
110
100101110
11110
10100110
10010110
101010
Make this all one line:
1010110110100101110111101010011010010110101010
The decimal value is now 47731979167146, which is a LOT smaller than the original 6.0109e16 we started with. This entire process can easily be automated, essentially as an “assembly language” compiler that creates “machine code.” The boot loader in the universal Turing machine would just reverse the process (see below).
Technically, Penrose indicates that there’s no need for an “end of program” marker, but I don’t see any way around it (see below). I’m using EOP = 111110. With this in place then, the binary representation of the algorithm and the data can easily coexist in the same textfile. Just read in everything to the boot loader as one long string, interpret the algorithm, and auto launch it when you’re done.
As one last note, Roger talks about the first few Turing machines you get for the numbers 0 to 13. Remembering the interpretation steps,
0 = 00->00R
1 = 00->00R, 01->01R
2 = 00->00R, 01->10R
…
These are not particularly well-formed, but they will run (and never, ever stop…)
Ok, changes to my VBScript. I know this is going to look ugly, and that I should add more comments to explain what’s going on, but WordPress screws up all the formatting anyway, and if anyone does have questions, you can always ask in the comments.
The contents of “universal.txt” are:
1010110110100101110111101010011010010110101010 ‘ compressed xn_times_2
111110 ‘ EOP
00000101010000101100000 ‘ The data: 113, plus 0 padding
First, in the main body of the script, I added an algoRaw() array to hold the full reconstructed algorithm, line-per-line. Then I make a call to readFile(“universal.txt”) (universal.txt contains the binary-encoded version of xn_times_2 plus the data). This loads algoRaw() with the same content as you can find in xn_times_2.txt in a previous blog entry. readFile() also returns the data portion of the binary-encoded string, which I save to tapeReel. I changed readAlgorithm() to procAlgorithm(), and stripped out the file handling code from that. Otherwise, the main body doesn’t change at all.
——
dim algo(100, 4), algoRaw(100)
algoMax = 0
algoRawMax = 1
tapeReel = readFile(“universal.txt”)
dummy = procAlgorithm()
——
procAlgorithm() originally opened up the desired file, read the algorithm, and parsed it into the algo() array. Since I’ve moved the file handling part into readFile(), all I had to change in this function was in getting the algorithm statements from the algoRaw() array. Everything else is the same.
Function procAlgorithm()
while(algoMax < algoRawMax)
str = algoRaw(algoMax)
algo(algoMax, 0) = left(str, instr(str, “-“) – 1)
tempStr = right(str, len(str) – instr(str, “-“))
algo(algoMax, 3) = 0 ‘ Default to Stop
if(instr(tempStr, “STOP”) > 0) then
algo(algoMax, 1) = left(tempStr, len(tempStr) – 5) ‘ New state
algo(algoMax, 2) = mid(tempStr, len(tempStr) – 4, 1) ‘ Write bit
else
algo(algoMax, 1) = left(tempStr, len(tempStr) – 2) ‘ New state
algo(algoMax, 2) = mid(tempStr, len(tempStr) – 1, 1) ‘ Write bit
tempDir = right(tempStr, 1) ‘ Move direction
if(tempDir = “L”) then
algo(algoMax, 3) = -1 ‘ Move left
elseif(tempDir = “R”) then
algo(algoMax, 3) = 1 ‘ Move right
else wscript.echo “Bad Move Head Direction: ” & tempDir
end if
end if
algoMax = algoMax + 1
wend
procAlgorithm = true
end Function
Ok, this is the new part, and is really ugly. I know this. But it works. Plus, it works in reconstructing the xn_times_2 algorithm exactly as described above. If you have difficulty figuring out what’s going on, reread the section on making the algorithm smaller by removing the fat. In readFile(), I’m just putting all the fat back in from where it came.
Function readFile(rAFN)
set fso = CreateObject(“Scripting.FileSystemObject”)
fIn = “D:\penrose\” & rAFN
wscript.echo “Reading file: ” & fIn
Set inData = fso.OpenTextFile(fIn, ForReading, True)
rFStr = “”
rFtemp = “”
rFBuild = “”
while(not inData.AtEndOfStream)
rFStr = rFStr & inData.readline
wend
inData.close
algoRaw(0) =”00-00R”
notEOP = true
while(notEOP)
rFChr = left(rFStr, 1)
rFStr = right(rFStr, len(rFStr) – 1)
rFTemp = rFTemp & rFChr
if(rFChr = “0”) then
if(strcomp(rFTemp, “111110”) = 0) then
notEOP = false
rFTemp = left(rFTemp, len(rFTemp) -6) & “110”
end if
rFToken = “”
rfEOL = false
if(strcomp(rFTemp, “0”) = 0) then
rFtoken = “0”
elseif(strcomp(rFTemp, “10”) = 0) then
rFtoken = “1”
elseif(strcomp(rFTemp, “110”) = 0) then
rFtoken = “R”
rFEOL = true
elseif(strcomp(rFTemp, “1110”) = 0) then
rFtoken = “L”
rFEOL = true
elseif(strcomp(rFTemp, “11110”) = 0) then
rFtoken = “STOP”
rFEOL = true
else
notEOP = false
wscript.echo “Problem with token : |” & rFTemp & “|”
end if
rFTemp = “”
rFBuild = rFBuild & rFToken
if(rFEOL) then
if((rFToken = “R” OR rFToken = “L”) and len(rFBuild) < 3) then
rFBuild = right(“000” & rFBuild, 3)
elseif(strComp(rFToken, “STOP”) = 0 and len(rFBuild) < 6) then
rFBuild = right(“000” & rFBuild, 6)
end if
rFNo = cstr(DecimalToBinary(algoRawMax))
if(len(rFNo) < 2) then
rFNo = “0” & rFNo
end if
algoRaw(algoRawMax) = rFNo & “-” & rFBuild
rFBuild = “”
algoRawMax = algoRawMax + 1
end if
end if
wend
readFile = rFStr
end Function
Finally, to convert the decimal line counter into the binary combination of “state” + “read bit” part that appears at the left of the “->” arrow, I needed the other half of the converter I found at the National Instruments forums.
Function DecimalToBinary(DecimalNum)
Dim tmp
Dim n
n = DecimalNum
tmp = Trim(cStr(n Mod 2))
n = n \ 2
Do While n <> 0
tmp = Trim(cStr(n Mod 2)) & tmp
n = n \ 2
Loop
DecimalToBinary = tmp
End Function
As a recap, universal.txt contains:
101011011010010111011110101001101001011010101011111000000010101000010110000
The reconstructed algorithm is:
00-00R
01-11R
10-00R
11-101L
100-00STOP
101-110R
110-101R
111-111R
And the program output for multiplying a number by 2 is:
Starting With 00000010101000010110000 = (1110001,)
Decimal = 113
Ping: Answer is: 000000101010000100110 (00000011100010,)
Decimal = 226
And, that’s a wrap for Turing Machines.
—
Actually, the thing about the end of program (EOP) marker. In compressing the algorithms down, we cut out the current state + bit read data to the left of the arrow, and the arrow itself. Then, we eliminate the first line, which is always 0->00R. Plus, we cut the leading zeroes for new state + write byte, like so:
0->00R
1->10R
10->11R
11->00STOP
Becomes
10R
11R
STOP
Represented as: 100110101011011110
Note that the lines always end with R, L or STOP, and they always start with a 1.
So, technically, if the algorithm is followed by 0s before we get to the data, we could simply say that if an R, L or STOP (110, 1110, 11110) is followed by another 0, that we’ve hit the end of the program and are now getting into the data.
Unfortunately, in order to shave off three more bits, we cut 110 off the right end, giving us:
100110101011011 + 000…
If we look at the lines
11R (1010110)
11L (10101110)
00STOP (11110)
Slicing 110 from the right of 11R and 11L , and appending data gives us:
1010 + 0000… = 1100000
10101 + 0000… = 1110000
Both of these could be mistaken for really long New State values, and the boot loader wouldn’t know where to stop loading.
Slicing 110 from STOP, plus data, gives
11 + 000000 = R00000
Assuming that R followed by a 0 is illegal, we COULD say, “this is the end of the program,” tack 110 back on to the 11 to get 11110000, and we’d be fine. But, this trick only works if STOP is the very last command in the algorithm, and that’s actually kind of rare.
It’s just easier to treat 111110 = EOP as an “algorithm/data” separator, and not include it when calculating the Turing machine number.