CSCI 341 Theory of Computation

Fall 2025, with Schmid

The Church-Turing Thesis

Last time, we saw Turing machines, our first honest-to-goodness full-scale mathematical model of computer hardware. But why did we define it the way we did? We had to make some choices about the design, right? And what's more, we seemed to go beyond what's normally represented by actual physical hardware: our definition allowed us to go beyond \(0\)s and \(1\)s and write arbitrary symbols to our memory. That's not good, right? Why can't we just use \(0\) and \(1\)?

Today, we are going to see that none of these choices matter in the end: no matter which "versions" of the Turing machine concept we choose, the same string transformers are computable. In fact, it is thought that any physically implementable model of computation that is at least as expressive as the Turing machine model is actually equivalent to the basic Turing machine model. This is what is known as the Church-Turing Thesis, which we will state more precisely at the end of today.

Today, we're going to make our way through a couple of different variations on the Turing machine theme, and show that they are all equivalent to the basic version of the Turing machine that we considered last time.

Binary Suffices

As our first example, let's argue that every computable function is representable by a Turing machine over the binary alphabet \(\{0, 1\}\).

(Binary Suffices) Let \(f \colon S_1 \to S_2\) be any function between sets \(S_1\) and \(S_2\). If \(f\) is Turing computable (in general, possibly with a Turing machine with many tape symbols), then there is a Turing machine \(\mathcal T = (Q, \{0,1\}, \delta)\) with a state \(x\) such that \(\mathcal T_x\) represents \(f\).
We are going to argue directly that for every Turing machine \(\mathcal T = (Q, A, \delta)\) with a state \(x\), there exists a Turing machine \(\mathcal T^\dagger = (Q^\dagger, \{0,1\}, \delta^\dagger)\) and a string representation \(\rho \colon A^* \to \{0,1\}^*\) such that \((\rho, \mathcal T^\dagger_{x}, \rho)\) is a representation of \(\mathcal T_x\). To then see that this produces a representation of \(f\), we just need to compose the two representations:

First of all, let \(|A| = n\) and set \(\ell = \lfloor\log(n)\rfloor + 1\). Then there is a binary representation for each \(a \in A\) with \(\ell\) many bits: specifically, if \(A = \{a_1, \dots, a_n\}\), then we can let \(\rho(a_i) = \mathsf{bin}(i-1)\) (possibly with some leading \(0\)s to make the length of \(\rho(a_i)\) equal to \(\ell\)). This allows us to define \(\rho \colon A^* \to \{0,1\}^*\) by \[ \rho(b_1b_2 \dots b_m) = \rho(b_1) ~\rho(b_2) ~ \cdots ~\rho(b_m) \] For example, if \(A = \{a,b,c\}\), then \(\ell = \lfloor\log(3)\rfloor + 1 = 2\), and we have \[ \rho(a) = 00 \qquad \rho(b) = 01 \qquad \rho(c) = 10 \] so that \[ \rho(bbac) = 01~01~00~10 \] This takes care of the string representation.
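The encoding \(\rho\) is easy to experiment with in code. Below is a small Python sketch (the function names are ours, not part of the course material) that builds the per-letter codes, following the formula \(\ell = \lfloor\log(n)\rfloor + 1\) from the text, and encodes strings letter by letter:

```python
from math import floor, log2

def make_rho(alphabet):
    """Build the per-letter binary code for a finite alphabet:
    letter a_i gets bin(i-1), left-padded with 0s to ell bits,
    where ell = floor(log2(n)) + 1 as in the text."""
    n = len(alphabet)
    ell = floor(log2(n)) + 1
    codes = {a: format(i, "b").zfill(ell) for i, a in enumerate(alphabet)}
    return codes, ell

def rho(word, codes):
    """Encode a string letter by letter, as in the proof."""
    return "".join(codes[a] for a in word)

codes, ell = make_rho(["a", "b", "c"])
# codes == {"a": "00", "b": "01", "c": "10"}, ell == 2
print(rho("bbac", codes))  # -> 01010010
```

Note that this follows the text's bit-count formula exactly, so for \(n = 4\) it uses \(3\) bits per letter even though \(2\) would suffice.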

Now we get to the hard part of defining \(\mathcal T^\dagger\). Fix \(\ell = \lfloor\log(n)\rfloor + 1\). The general idea is going to be to treat each block of \(\ell\) consecutive cells as one cell in the Turing machine \(\mathcal T\). This means that every move command has to move \(\ell\) spaces instead of \(1\), and every reading of the tape has to be replaced with \(\ell\) reading steps. Formally, the construction of \(Q^\dagger\) and \(\delta^\dagger\) happens in three stages:
  1. In the first stage, we replace every move right/left with \(\ell\) of the same move. That is, we replace \(\mathtt{move~right}\) with \(\mathtt{move~right}~\ell\), which we define to be \[ \mathtt{move~right}~\ell = \overbrace{\mathtt{move~right}{.}\mathtt{move~right} \dots \mathtt{move~right}}^{\ell \text{ times}} \] and similarly we replace every \(\mathtt{move~left}\) with \(\mathtt{move~left}~\ell\).
  2. In the second stage, we replace every \(\mathtt{write}~a\) with the binary representation of the written letter \(a\), and we replace \(\mathtt{erase}\) with \(\ell\) consecutive erase steps. In the former case, if \(\rho(a) = b_1b_2\cdots b_\ell\), we replace \(\mathtt{write}~a\) with \(\mathtt{write}~b_1b_2\dots b_\ell\), which we define to be \[\begin{aligned} &\mathtt{write}~b_1b_2\dots b_\ell \\ &= \mathtt{write}~b_1{.}\mathtt{move~right}{.}\mathtt{write}~b_2\dots \mathtt{move~right}{.}\mathtt{write}~b_\ell{.}\mathtt{move~left}~(\ell-1) \end{aligned}\] In the latter, \(\mathtt{erase}\) is replaced with \(\mathtt{write}~(\_)^\ell\), in the sense above.
  3. In the last step, for each state \(y \in Q\) and bitstring \(b_1b_2\dots b_\ell\), we add a path \[ y \xrightarrow{b_1 \mid \rhd} y_{b_1} \xrightarrow{b_2 \mid \rhd} \cdots y_{b_1b_2\dots b_{\ell-1}} \xrightarrow{b_{\ell} \mid \rhd} y_{b_1b_2\dots b_\ell} \] Then for each transition \(y \xrightarrow{a \mid p} z\), if \(\rho(a) = b_1\dots b_\ell\), then we replace that transition with the transition \[ y_{b_1\dots b_\ell} \xrightarrow{\lhd^\ell{.}p} z \] Notice that the transition is taken regardless of what is under the tape head. Above, we are using the shorthand \({\lhd^\ell} = \mathtt{move~left}~\ell\).
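Stages 1 and 2 can be viewed as a little compiler from basic tape commands of \(\mathcal T\) to \(\{0,1\}\)-tape programs of \(\mathcal T^\dagger\). Here is a Python sketch of that translation; the encoding of commands as strings and tuples is our own ad hoc choice, not the course's official syntax:

```python
def expand(cmd, codes, ell):
    """Stages 1 and 2 of the construction: translate one basic tape
    command of T into a list of {0,1}-tape commands for T-dagger.
    Moves become ell-fold moves; a write becomes ell writes-and-moves
    followed by (ell - 1) moves back to the start of the block."""
    if cmd == "right":
        return ["right"] * ell
    if cmd == "left":
        return ["left"] * ell
    if cmd == "skip":
        return ["skip"]
    if cmd[0] == "write":                     # ("write", a); a == "_" is erase
        bits = codes[cmd[1]] if cmd[1] != "_" else "_" * ell
        out = []
        for j, b in enumerate(bits):
            out.append(("write", b))
            if j < ell - 1:
                out.append("right")
        out += ["left"] * (ell - 1)           # return to the block start
        return out
    raise ValueError(cmd)

codes = {"a": "00", "b": "01", "c": "10"}
print(expand(("write", "b"), codes, 2))
# -> [('write', '0'), 'right', ('write', '1'), 'left']
```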
Let us now observe that \(\rho(\mathcal T_x(w)) = \mathcal T^\dagger_x(\rho(w))\) for any \(w \in A^*\) where \(\mathcal T_x(w)\) is well-defined.

Imagine running \(\mathcal T\) on a tape \((t, i)\) and \(\mathcal T^\dagger\) on a tape \((t', \ell i)\) at the same time, where for all \(k\) and all \(1 \le j \le \ell\), \[ t'(\ell k + j - 1) = b_{j} \qquad\qquad \text{(*)} \] where \(\rho(t(k)) = b_1 \dots b_\ell\). Each transition \(y \xrightarrow{a \mid p} z\) corresponds to the \(\ell + 1\) transitions \[ y \xrightarrow{b_1 \mid \rhd} y_{b_1} \xrightarrow{b_2 \mid \rhd} \cdots \xrightarrow{b_{\ell} \mid \rhd} y_{b_1b_2\dots b_\ell} \xrightarrow{\lhd^\ell{.}p'} z \] where \(p'\) is \(p\) but with the moves and writes adjusted as in stages 1 and 2 above. We just need to argue that, up to representation, the effect of \(p\) and \(p'\) on \((t, i)\) and \((t', \ell i)\) is the same.

More precisely, we are going to prove that for any tape program \(p\), \((t, i).p\) and \((t', \ell i){.}p'\) also satisfy the property stated in (*). To that end, we proceed by induction on \(p\).
  1. Base Case 1. If \(p = \mathtt{skip}\), then \((t, i){.}\mathtt{skip} = (t,i)\) and \((t', \ell i){.}\mathtt{skip} = (t',\ell i)\) and we are done.
  2. Base Case 2. If \(p = \mathtt{move~right}\) (the case \(\mathtt{move~left}\) is symmetric), then \(p' = \mathtt{move~right}~\ell\), so that \((t, i).p = (t, i+1)\) and \((t', \ell i){.}p' = (t', \ell i + \ell) = (t', \ell (i + 1))\) and we are done.
  3. Base Case 3. If \(p = \mathtt{write}~a\), then \(p' = \mathtt{write}~b_1b_2\dots b_\ell\) where \(\rho(a) = b_1b_2\dots b_\ell\). In that case, after running \(p'\), \[ t'(\ell i + j - 1) = b_{j} \] for each \(1 \le j \le \ell\), and the position of \((t', \ell i).\mathtt{write}~b_1\dots b_\ell\) is \(\ell i\), by design (see the definition of \(\mathtt{write}~b_1\dots b_\ell\) above).
  4. Induction Step. Finally, suppose that \(p = p_1{.}p_2\), and that the statement is true for \(p_1\) and \(p_2\). Then \(p' = p_1'{.}p_2'\), where \(p_1'\) and \(p_2'\) were obtained from \(p_1\) and \(p_2\) via stages 1 and 2 above. Using the induction hypothesis twice, we see that \((t, i){.}p_1\) and \((t', \ell i){.}p_1'\) satisfy (*), and therefore \((t, i){.}p_1{.}p_2\) and \((t', \ell i){.}p_1'{.}p_2'\) satisfy (*).
This finishes the proof.
(Simulating ABCs by Hand) Consider the Turing machine over the alphabet \(A = \{a,b,c\}\) below.
x = if blank goto y
if \(a\) write \(b\).move right.goto x
if \(b\) write \(c\).move right.goto x
if \(c\) write \(a\).move right.goto x
y = if blank move left.halt
  1. Draw a state diagram of this Turing machine.
  2. Use the construction in the proof of the Binary Suffices theorem to produce a Turing machine over the alphabet \(\{0,1\}\) that computes the same string transformer (up to the representation \(\rho\)).
  3. If \(\mathcal T\) is the Turing machine above, run \(\mathcal T_x(abc)\) and \(\mathcal T^\dagger_x(000110)\) in parallel to verify that \(\rho(\mathcal T_x(abc)) = \mathcal T_x^\dagger(000110)\).
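If you would like to check your answers mechanically, here is a small Python simulator for single-tape machines. The transition encoding and halting convention (halt when no transition matches) are our own guesses made concrete enough to run, and the `goto y` plus the move in state \(y\) are merged into one transition:

```python
from collections import defaultdict

def run(delta, state, word):
    """Run a single-tape machine. delta maps (state, symbol) to
    (program, next_state), where a program is a list of primitive
    commands; the machine halts when no transition matches."""
    tape = defaultdict(lambda: "_", enumerate(word))
    pos = 0
    while (state, tape[pos]) in delta:
        prog, state = delta[(state, tape[pos])]
        for cmd in prog:
            if cmd == "right":
                pos += 1
            elif cmd == "left":
                pos -= 1
            else:                      # ("write", symbol)
                tape[pos] = cmd[1]
    # read the output off the tape, skipping blanks
    return "".join(s for _, s in sorted(tape.items()) if s != "_")

# the ABC machine; "goto y" and y's move-left are merged into one transition
delta = {
    ("x", "a"): ([("write", "b"), "right"], "x"),
    ("x", "b"): ([("write", "c"), "right"], "x"),
    ("x", "c"): ([("write", "a"), "right"], "x"),
    ("x", "_"): (["left"], "y"),
}
print(run(delta, "x", "abc"))  # -> bca
```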

One-sided Tapes Suffice

We chose to define our Turing machines to have 2-sided tapes, but we could just as well have restricted ourselves to a one-sided tape. Indeed, the two resulting models of computation are equivalent!

(One-way Tape Machines) A one-way tape machine is a pair \((t,i)\) consisting of a function \(t \colon \mathbb N \to A \cup \{\_\}\) called the tape and a nonnegative (!) integer \(i\) called the position (of the tape head).

One-way tape machines also run \(\mathtt{Tape}\) programs, with one difference: the semantics of \(\mathtt{move~left}\) is different at position \(0\). \[ (t, 0).\mathtt{move~left} = (t, 0) \]

The notion of Turing machine for a one-way tape machine is exactly the same as for the two-way tape machine. Also, the definition of \(\mathcal T_x \colon A^* \rightharpoonup A^*\) for a given Turing machine \(\mathcal T\) with a state \(x\) (run on one-way tape machine) is the same.

For one-sided tapes, we generally fix a particular symbol, \(\blacksquare \in A\), called the end of the tape (or \(\mathtt{end}\)), and immediately write this symbol to the beginning of the tape; it is also generally accepted practice to erase this symbol at the end of execution. This leaves us with the following picture of one-sided tape machines:

(One-way tapes suffice) Let \(f \colon S_1 \to S_2\) be any function between sets \(S_1\) and \(S_2\). If \(f\) is Turing computable (with a Turing machine run on a two-way tape), then there is a Turing machine \(\mathcal T\) with a state \(x\) such that \(\mathcal T_x\) represents \(f\) (where \(\mathcal T_x\) is evaluated on a one-way tape machine).
Again, this is a matter of simulating one type of tape machine with another. We are going to simulate a two-way tape machine \((t, i)\) with a one-way tape machine by interleaving the positively indexed cells of \(t\) and the negatively indexed cells of \(t\). In particular, if \(t\) is given by \[ t = \qquad ~\dots \mid -3 \mid -2 \mid -1 \mid 0 \mid 1 \mid 2 \mid 3 \mid \cdots~ \] then we are going to squeeze this into the tape \[ t' = \qquad [~\blacksquare ~\mid ~0 \mid -1 \mid 1 \mid -2 \mid 2 \mid -3 \mid 3 \mid \cdots~ \] Formally, given a two-way tape \((t, i)\), we transform it into the one-way tape \((t', i')\) defined as follows: the tape \(t'\) is given by \[ t'(j) = \begin{cases} \blacksquare & j = 0\\ t(k) & -2k = j \text{ and } k < 0 \\ t(k) & 2k + 1 = j \text{ and } k \ge 0 \end{cases} \qquad \text{(*)} \] where \(j\) ranges over all \(j \in \mathbb N\) and \(k\) over all \(k \in \mathbb Z\), and the position \(i'\) is given by \[ \begin{aligned} i' &= -2i &\text{if \(i < 0\)} \\ i' &= 2i + 1 &\text{if \(i \ge 0\)} \end{aligned} \qquad \text{(*)} \] We are going to construct a Turing machine \(\mathcal T^\dagger = (Q^\dagger, A \cup \{\blacksquare\}, \delta^\dagger)\) such that \(\mathcal T_x = \mathcal T^\dagger_{(x,+)}\) for every \(x \in Q\), where \((x,+)\) is the corresponding state in \(Q^\dagger\) defined below.
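The index translation in (*) is simple arithmetic, and it is a bijection between \(\mathbb Z\) and the positive one-way indices; a quick Python check (function names ours):

```python
def fold(i):
    """Two-way index i -> one-way index i', as in (*): negative
    indices go to positive even cells, nonnegative ones to odd cells."""
    return -2 * i if i < 0 else 2 * i + 1

def unfold(j):
    """Inverse translation, defined for one-way indices j >= 1
    (index 0 is reserved for the end-of-tape marker)."""
    return -(j // 2) if j % 2 == 0 else j // 2

print([fold(i) for i in (-2, -1, 0, 1, 2)])  # -> [4, 2, 1, 3, 5]
```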

Without loss of generality, we can assume that for any transition \(x \xrightarrow{a \mid p} y\) in the original \(\mathcal T\), \(p\) is a basic tape program (one of \(\mathtt{skip},\mathtt{move~left/right},\mathtt{write}~\sigma\)) (simply stretch out any composite tape programs into multiple transitions). Here is the general idea behind the construction of \(\mathcal T^\dagger\): we are going to make two copies of \(Q\), one where "the original position \(i\) is nonnegative (\(\ge 0\))", and one where "the original position is negative", and switch back and forth when the original tape head changes between negative and positive. We define \[ Q^\dagger = (Q \times \{+\}) \cup (Q \times \{-\}) \cup Q^\spadesuit \] where \(Q^\spadesuit\) is a set of states we will mention momentarily, with a distinguished state \(z^\spadesuit \in Q^\spadesuit\). The construction of \(\delta^\dagger\) now proceeds as follows:
  1. For any transition \(x \xrightarrow{a \mid p} y\), if \(p = \mathtt{move~right}\), then we add the transitions \[ (x,+) \xrightarrow{a \mid \rhd.\rhd} (y,+) \qquad (x,-) \xrightarrow{a \mid \lhd.\lhd} (y,-) \] and similarly, if \(p = \mathtt{move~left}\), then we add the transitions \[ (x,+) \xrightarrow{a \mid \lhd.\lhd} (y,+) \qquad (x,-) \xrightarrow{a \mid \rhd.\rhd} (y,-) \] These model the movement of the original tape head by squeezing the negative indices into the even indices and positive indices into the odd indices: if you want to move the tape head once in the original machine, you need to move twice in this one. If \(p \in \{\mathtt{skip},\mathtt{write}~\sigma\}\), then we add both \[ (x,+) \xrightarrow{a \mid p} (y,+) \qquad (x,-) \xrightarrow{a \mid p} (y,-) \] since neither move the tape head.
  2. For every state \(x\), we add the transitions \[ (x, +) \xrightarrow{\blacksquare \mid \rhd.\rhd} (x,-) \qquad (x, -) \xrightarrow{\blacksquare \mid \rhd} (x,+) \] This implements the switching between positive and negative.
  3. For any \(\sigma \in \{\_\} \cup A\) such that \(x\) immediately halts on input \(\sigma\), we add the transitions \((x, +) \xrightarrow{\sigma \mid \mathtt{skip}} z^\spadesuit\) and \((x, -) \xrightarrow{\sigma \mid \mathtt{skip}} z^\spadesuit\).
Now let us describe the program \(z^\spadesuit\). This program shifts the odd ("nonnegative index") symbols on the one-way tape over, making room for the even ("negative index") spaces. The purpose of this is to make sure that the order in which the symbols appears on the output of running \(\mathcal T^\dagger_{(x, +)}(w)\) is the same order as \(\mathcal T_x(w)\). Let us start with an example: suppose that we had the following tape \[ t = \qquad \cdots \mid \_ \mid a_{-2} \mid a_{-1} \mid a_0 \mid a_1 \mid a_2 \mid a_3 \mid \_ \mid \cdots \] This is squeezed into the tape \[ t' = \qquad [~\blacksquare \mid a_0 \mid a_{-1} \mid a_1 \mid a_{-2} \mid a_2 \mid \_ \mid a_3 \mid \_ \mid \_ \mid \cdots \] The program \(z^\spadesuit\) now acts on \(t'\) as follows:
  1. First, it tests whether there is a non-blank symbol at an even index greater than 0. If there is none, it erases \(\blacksquare\) and halts. On this tape, \(a_{-1}\) is at index \(2\), so it does not erase \(\blacksquare\) or halt: \[ t' = \qquad [~\blacksquare \mid a_0 \mid {\color{blue} a_{-1}} \mid a_1 \mid a_{-2} \mid a_2 \mid \_ \mid a_3 \mid \_ \mid \_ \mid \cdots \]
  2. Otherwise, it moves the tape head back to index \(1\) (using \(\blacksquare\) as a landmark), and shifts all odd indexed symbols over by two cells. In the example, this produces \[ [~\blacksquare \mid \_ \mid a_{-1} \mid a_0 \mid a_{-2} \mid a_1 \mid \_ \mid a_2 \mid \_ \mid a_3 \mid \cdots \]
  3. It then moves the item at index 2 left by one. In the example, this produces \[ [~\blacksquare \mid a_{-1} \mid \_ \mid a_0 \mid a_{-2} \mid a_1 \mid \_ \mid a_2 \mid \_ \mid a_3 \mid \cdots \]
  4. Finally, it moves all even indexed symbols with index greater than 2 to the left two squares, and then returns to step 1. In the example, this produces \[ [~\blacksquare \mid a_{-1} \mid a_{-2} \mid a_0 \mid \_ \mid a_1 \mid \_ \mid a_2 \mid \_ \mid a_3 \mid \cdots \]
Running the example above longer, we would have \[\begin{array}{c | c | c | c | c | c | c | c | c | c | c | c | c | r} [~\blacksquare & a_{-1} & {\color{blue}a_{-2}} & a_0 & \_ & a_1 & \_ & a_2 & \_ & a_3 & \_ & \_ & \cdots & \text{step 1}\\ [~\blacksquare & \_ & a_{-2} & {\color{purple} a_{-1}} & \_ & {\color{purple} a_0} & \_ & {\color{purple} a_1} & \_ & {\color{purple} a_2} & \_ & {\color{purple} a_3} & \cdots & \text{step 2} \\ [~\blacksquare & {\color{purple} a_{-2}} & \_ & a_{-1} & \_ & a_0 & \_ & a_1 & \_ & a_2 & \_ & a_3 & \cdots & \text{step 3} \\ [~\blacksquare & a_{-2} & \_ & a_{-1} & \_ & a_0 & \_ & a_1 & \_ & a_2 & \_ & a_3 & \cdots & \text{step 4 (nothing happens)} \\ [~\_ & a_{-2} & \_ & a_{-1} & \_ & a_0 & \_ & a_1 & \_ & a_2 & \_ & a_3 & \cdots & \text{step 1 halts} \\ \end{array}\] The set \(Q^\spadesuit\) is whatever number of states are necessary to implement the program \(z^\spadesuit\).
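The net effect of \(z^\spadesuit\) can also be computed directly, without the intermediate shifting. The following Python sketch (our own list-based model, assuming the written region is contiguous with no internal blanks) reproduces the final tape of the worked run above:

```python
def zspade_effect(tape):
    """Net effect of the z-spade routine, on a one-way tape modelled as
    a Python list with the end-of-tape marker "#" at index 0: the even
    (negative-index) symbols are moved in front of the odd
    (nonnegative-index) symbols, preserving their order, with a blank
    between consecutive symbols."""
    neg = [s for s in tape[2::2] if s != "_"]   # a_{-1}, a_{-2}, ...
    pos = [s for s in tape[1::2] if s != "_"]   # a_0, a_1, ...
    out = ["_"]
    for s in neg[::-1] + pos:                   # a_{-m}, ..., a_{-1}, a_0, ...
        out += [s, "_"]
    return out

print(zspade_effect(["#", "a0", "a-1", "a1", "a-2", "a2", "_", "a3"]))
# -> ['_', 'a-2', '_', 'a-1', '_', 'a0', '_', 'a1', '_', 'a2', '_', 'a3', '_']
```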

To see that the construction we have just proposed is correct, imagine running \(\mathcal T\) on a tape \((t, i)\) and \(\mathcal T^\dagger\) on the tape \((t', i')\) defined above in parallel. Every movement of the two-way tape head by \(\mathcal T\) is matched by a movement of the one-way tape head by \(\mathcal T^\dagger\) such that the two properties labelled (*) above are preserved. This in particular means that \(\mathcal T\) writes to index \(i\) at the same time that \(\mathcal T^\dagger\) writes to \(i'\), where \(i'\) is as it was defined in (*) above. This implies that if a state \(x\) in \(\mathcal T\) halts on input \(w\) with resulting tape given by \[ \cdots ~\mid a_{-m} \mid \cdots \mid a_{-1} \mid a_0 \mid a_1 \mid \cdots \mid a_n \mid \cdots \] then \((x, +)\) arrives at \(z^\spadesuit\) with the state of the tape machine being \[ [~\blacksquare ~\mid a_0 \mid a_{-1} \mid a_1 \mid \cdots \mid a_{-m} \mid \cdots \mid a_n \mid \cdots \] Then running \(z^\spadesuit\) on the latter tape results in the tape \[ [~\_ ~\mid a_{-m} \mid \_ \mid \cdots \mid \_ \mid a_{-1} \mid \_ \mid a_0 \mid \_ \mid a_1 \mid \_ \mid \cdots \mid \_ \mid a_n \mid \cdots \] Thus, if \(\mathcal T_x(w) = u\), then \(\mathcal T_{(x,+)}^\dagger(w) = u\).

One Tape Machine Suffices

Here is another idea: what if we wanted to have two tape machines going instead of one?

(Multi-tape Machines) Given a natural number \(n \in \mathbb N\), an (two-way) \(n\)-tape machine is a tuple of \(n\) tape machines \(\langle (t_1, i_1), \dots, (t_n, i_n)\rangle\). The set \(\mathtt{Tape}_n\) of \(n\)-tape programs is derived from the grammar \[ E \to \mathtt{skip} \mid \mathtt{move}~k~\mathtt{left} \mid \mathtt{move}~k~\mathtt{right} \mid \mathtt{write}~\sigma~\mathtt{to}~k \mid E{.}E \] where \(1 \le k \le n\) and \(\sigma \in A\). The semantics of \(n\)-tape programs is as follows: Given an \(n\)-tape machine \(\langle (t_1, i_1), \dots, (t_n, i_n)\rangle\), we define \[\begin{aligned} \langle (t_1, i_1), \dots, (t_n, i_n)\rangle.\mathtt{skip} &= \langle (t_1, i_1), \dots, (t_n, i_n)\rangle \\ \langle (t_1, i_1), \dots, (t_n, i_n)\rangle.\mathtt{write}~\sigma~\mathtt{to}~k &= \langle (t_1, i_1), \dots, (t_k,i_k).\mathtt{write}~\sigma, \dots, (t_n, i_n)\rangle \\ \langle (t_1, i_1), \dots, (t_n, i_n)\rangle.\mathtt{move}~k~\mathtt{left} &= \langle (t_1, i_1), \dots, (t_k, i_k - 1), \dots, (t_n, i_n)\rangle \\ \langle (t_1, i_1), \dots, (t_n, i_n)\rangle.\mathtt{move}~k~\mathtt{right} &= \langle (t_1, i_1), \dots, (t_k, i_k + 1), \dots, (t_n, i_n)\rangle \end{aligned}\]
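The semantics above translates almost verbatim into code. Here is a minimal Python model of an \(n\)-tape machine (the class and command encodings are our own; \(k\) is 1-based, as in the text):

```python
class NTapes:
    """An n-tape machine: n (tape, position) pairs. Tapes are dicts
    from integer index to symbol, with "_" standing for blank."""
    def __init__(self, n):
        self.t = [dict() for _ in range(n)]
        self.i = [0] * n

    def read(self, k):
        """Symbol under the head of tape k."""
        return self.t[k - 1].get(self.i[k - 1], "_")

    def step(self, cmd):
        """Execute one basic n-tape command, per the semantics above."""
        if cmd == "skip":
            return
        op, k = cmd[0], cmd[1]
        if op == "left":
            self.i[k - 1] -= 1
        elif op == "right":
            self.i[k - 1] += 1
        elif op == "write":
            self.t[k - 1][self.i[k - 1]] = cmd[2]

m = NTapes(2)
for c in [("write", 1, "a"), ("right", 1), ("write", 2, "b")]:
    m.step(c)
print(m.read(1), m.read(2))  # -> _ b
```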

Again, the corresponding notion of Turing machine is the same, with the one exception that the transition relation of an \(n\)-tape Turing machine \(\mathcal T = (Q, A, \delta)\) is now \[ \delta \subseteq Q \times (A \cup \{\_\})^n \times \mathtt{Tape}_n \times Q \] Given that \((x, (\sigma_1, \sigma_2, \dots, \sigma_n), p, y) \in \delta\), we write \[ x \xrightarrow{\sigma_1, \sigma_2, \dots, \sigma_n \mid p} y \] The tuple \(\sigma_1,\dots,\sigma_n\) in the transition denotes \[ \mathtt{if}~(\sigma_1~\mathtt{on~tape~}1~\mathtt{and}~\cdots~\mathtt{and}~\sigma_n~\mathtt{on~tape~}n) \] Lastly, we take the string transformer \(\mathcal T_x \colon A^* \rightharpoonup A^*\) defined by the state \(x\) of the \(n\)-tape Turing machine \(\mathcal T\) to be determined by inputting our string to the first indexed tape and reading the result from the first indexed tape after a halting run.

(One Tape Suffices) Let \(f \colon S_1 \to S_2\) be any function between sets \(S_1\) and \(S_2\). If \(f\) is Turing computable on an \(n\)-tape Turing machine, then there is a single tape Turing machine \(\mathcal T\) with a state \(x\) such that \(\mathcal T_x\) represents \(f\).
It actually suffices to show that a \(2\)-tape Turing machine can be simulated with a \(1\)-tape (vanilla) Turing machine. We sketch the construction here and let you, the diligent reader, fill in the details. Similar to the two-way to one-way construction, the idea is to squeeze two tapes into one: given a pair of tape machines \(((t_1, i_1), (t_2, i_2))\), \[t_1 = \begin{array}{ l | c | c | c | c | r} \hline \cdots & {\color{blue} a_1} & {\color{blue} a_2} & {\color{blue} a_3} & {\color{blue} a_4} & \cdots \\ \hline \end{array}\] \[t_2 = \begin{array}{ l | c | c | c | c | r} \hline \cdots & b_1 & b_2 & b_3 & b_4 & \cdots \\ \hline \end{array}\] we are going to squeeze them into \[t' =\begin{array}{l | c | c | c | c | c | c | c | c | r} \hline \cdots & {\color{blue} a_1} & b_1 & {\color{blue} a_2} & b_2 & {\color{blue} a_3} & b_3 & {\color{blue} a_4} & b_4 & \cdots \\ \hline \end{array}\] The only tricky part is that we only have one tape head now! This can be fixed as follows: we need to add more symbols. Let \(A\) be the original alphabet of tape symbols, and define \[ A^\bullet = A \cup \{\sigma^\framebox{1} \mid \sigma \in \{\_\} \cup A\} \cup \{\sigma^\framebox{2} \mid \sigma \in \{\_\} \cup A\} \] This triples the alphabet by adding in marked letters \(a^\framebox{1}\) and \(a^{\framebox{2}}\) for every letter \(a \in A\). A marked letter \(a^\framebox{1}\) is going to tell us "where the first tape head is", and a marked letter \(a^\framebox{2}\) is going to tell us "where the second tape head is". 
We therefore represent \(\langle (t_1,i_1), (t_2, i_2)\rangle\) on a single tape \((t', i')\) as \[ t'(j) = \begin{cases} t_1(k) & j = 2k \text{ and } i_1 \neq k\\ t_1(k)^\framebox{1} & j = 2k \text{ and } i_1 = k\\ t_2(k) & j = 2k + 1 \text{ and } i_2 \neq k\\ t_2(k)^\framebox{2} & j = 2k + 1 \text{ and } i_2 = k \end{cases} \qquad i' = 0 \] The choice of \(i'\) is somewhat arbitrary; when mimicking a tape head command, the machine first scans to whichever marked letter is relevant, wherever it happens to be. It is also worth noting that one could get away with only one kind of marker, since the first tape appears in even cells and the second in odd cells (thus allowing the machine to tell the difference).
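The representation with head markers is easy to compute for any finite window of the two tapes. In the Python sketch below (names ours), marked letters are modelled as (symbol, tag) pairs:

```python
def squeeze(t1, i1, t2, i2, width):
    """Interleave two tapes (dicts from index to symbol, "_" = blank)
    into one: tape-1 cell k lands at index 2k, tape-2 cell k at 2k+1,
    and the cells under the heads i1, i2 are marked with tags 1 and 2.
    Only the window 0 <= k < width is produced, for display purposes."""
    out = []
    for k in range(width):
        s1, s2 = t1.get(k, "_"), t2.get(k, "_")
        out.append((s1, 1) if k == i1 else s1)
        out.append((s2, 2) if k == i2 else s2)
    return out

print(squeeze({0: "1", 1: "1", 2: "1"}, 0, {0: "0", 1: "0"}, 1, 3))
# -> [('1', 1), '0', '1', ('0', 2), '1', '_']
```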

We can now define our Turing machine \(\mathcal T^\dagger = (Q^\dagger, A^\bullet, \delta^\dagger)\). We can assume without loss of generality that in the original Turing machine \(\mathcal T\), for any transition \(x \xrightarrow{a_1, a_2 \mid p} y\), \(p\) is a basic \(2\)-tape program. The construction of \(\mathcal T^\dagger\) now looks like this:
  1. We add two programs, \(z_{\mathtt{sc1}}\) and \(z_{\mathtt{sc2}}\), that do the following: \(z_{\mathtt{sc1}}\) scans the cells of the tape (from the left-most non-blank cell to the right-most) and immediately halts on any marked symbol \(\sigma^\framebox{1}\); \(z_{\mathtt{sc2}}\) does the same for marked symbols \(\sigma^\framebox{2}\). We abbreviate these as \(\mathtt{scan1}\) and \(\mathtt{scan2}\).
  2. For each transition \(x \xrightarrow{\sigma_1,\sigma_2 \mid \mathtt{move}~1~\mathtt{right}} y\) in \(\mathcal T\), we append the following program to \(\mathcal T^\dagger\):
    x = scan1.goto x\(_{\sigma_1}\)
    x\(_{\sigma_1}\) = if \(\sigma_1^\framebox{1}\) scan2.goto x\(_{(\sigma_1,\sigma_2)}\)
    x\(_{(\sigma_1,\sigma_2)}\) = if \(\sigma_2^\framebox{2}\) scan1.write \(\sigma_1\).move right \(2\).goto x\(_{(\sigma_1,\sigma_2)}\)
    if \(\tau\) (i.e., unmarked) write \(\tau^\framebox{1}\).goto y
    We would append similar programs for transitions of the form \(x \xrightarrow{\sigma_1,\sigma_2 \mid \mathtt{move}~2~\mathtt{right}} y\), as well as for move-left commands, in \(\mathcal T\).
  3. For a transition \(x \xrightarrow{\sigma_1,\sigma_2 \mid \mathtt{write}~\tau~\mathtt{to}~1} y\), we append
    x = scan1.goto x\(_{\sigma_1}\)
    x\(_{\sigma_1}\) = if \(\sigma_1^\framebox{1}\) scan2.goto x\(_{(\sigma_1,\sigma_2)}\)
    x\(_{(\sigma_1,\sigma_2)}\) = if \(\sigma_2^\framebox{2}\) scan1.write \(\tau^\framebox{1}\).goto y
    A similar program is appended for writing to the second tape.
  4. Finally, since the output tape of \(\mathcal T\) is just the first tape, we append to \(\mathcal T^\dagger\) a program that erases all odd-indexed cells and changes every marked symbol \(\sigma^\framebox{1}\) to its unmarked companion \(\sigma\).
The gist of why this works is simply that we have squeezed two tapes into one tape and successfully kept track of both tape heads. This ensures that \(\mathcal T\) writes to its first tape precisely when \(\mathcal T^\dagger\) writes to an even-indexed cell. Since the final step in \(\mathcal T^\dagger\) is to erase the odd-indexed cells, \(\mathcal T_x(w) = \mathcal T_x^\dagger(w)\) for all \(w\) on which \(\mathcal T_x\) is defined.
(Two-to-One) Simulate the following \(2\)-tape Turing machine with a \(1\)-tape Turing machine. Above, the \(*\) means "any symbol". If the Turing machine above is \(\mathcal T\), run \(\mathcal T(111,0000)\) in parallel with \(\mathcal T^\dagger(101010\_0)\) (note that \(\_\) is not officially a letter in \(A\), but we use it here for the sake of specifying a tape).
(All the tapes!) In the proof above, we only showed that \(2\)-tape Turing machines could be simulated with ordinary Turing machines. How might we prove that an arbitrary \(n\)-tape Turing machine can be simulated by an ordinary Turing machine?
(With a Counter) Let's consider a variant of the Turing machine that implements an additional counter. Its basic machine programs are generated by the grammar \[ E \to \mathtt{skip} \mid \mathtt{move~left} \mid \mathtt{move~right} \mid \mathtt{write}~\sigma \mid \mathtt{count~up} \mid \mathtt{count~dn} \] and its states have the additional feature of being able to check whether the value of the counter is a particular value. That is, you can write \[ \mathtt{if}~\sigma~\mathtt{and}~\mathtt{count}~5~(\text{followed by commands}) \] The output of a Turing machine that operates with a counter is contained in the (single) tape, as usual. Show that every string transformer that can be represented by a Turing machine with a counter is Turing computable in the ordinary sense.
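As a starting point for this exercise (a sketch of the idea only, with our own names, not the full construction): the counter value can be kept in unary on an auxiliary tape, so that each counter command becomes a constant number of tape commands and the test \(\mathtt{count}~k\) becomes a bounded scan of at most \(k+1\) cells.

```python
# Counter kept as a block of 1s on an auxiliary tape, modelled as a string.
def count_up(aux):
    return aux + "1"              # append one mark

def count_dn(aux):
    return aux[:-1]               # remove one mark; no-op on zero

def count_is(aux, k):
    return len(aux) == k          # "count k": scan at most k+1 cells

aux = ""
for _ in range(5):
    aux = count_up(aux)
print(count_is(aux, 5))  # -> True
```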
Show that these can be simulated with \(2\)-tape Turing machines.
(Computing in Flatland) A (corner-)sheet machine \((s, (i, j))\) consists of a function \(s \colon \mathbb N\times \mathbb N \to \{\_\} \cup A\) called a (corner-)sheet, and a pair of coordinates \((i,j) \in \mathbb N \times \mathbb N\) called the position (of the sheet head).
Artist's depiction of a sheet machine in its natural habitat.
Sheet machine programs are generated by the grammar \[ E \to \mathtt{skip} \mid \mathtt{move~left} \mid \mathtt{move~right} \mid \mathtt{move~up} \mid \mathtt{move~down} \mid \mathtt{write}~\sigma \] The output of a Turing machine that operates on sheets is contained in the bottom row of the machine. Show that every string transformer that can be represented by a Turing machine that operates on sheets is Turing computable in the ordinary sense.
Use a second tape to keep track of the \(x\)-index of the sheet head. Here is a picture that might help.
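As an alternative to the two-tape hint, a corner-sheet can be stored on a single one-way tape via any pairing bijection \(\mathbb N \times \mathbb N \to \mathbb N\); the cost is that one sheet move becomes a longer tape excursion. Here is the standard Cantor pairing in Python (our own illustration, not part of the exercise):

```python
def pair(i, j):
    """Cantor pairing, a bijection N x N -> N: sheet cell (i, j)
    could be stored at tape index pair(i, j)."""
    return (i + j) * (i + j + 1) // 2 + j

# distinct coordinates get distinct tape indices on a small grid:
coords = [(i, j) for i in range(20) for j in range(20)]
assert len({pair(i, j) for i, j in coords}) == len(coords)
```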

The Church-Turing Thesis

There are dozens more variations on the Turing machine theme that we could discuss here. You might even cover one in your project. But the point is that they are all equivalent in power to the vanilla Turing machine concept.

After all of that, you might be left wondering: can we beat the single-tape Turing machine? This is a good question as it is, but it also could use a bit more specificity:

Is there an algorithm that is, ignoring finite differences, implementable on a physical device and cannot be represented on a Turing machine?
This is the central scientific question of theoretical computer science. The word scientific is used here in a very literal sense: it is not a mathematical question, because there is no mathematical definition of "ignoring finite differences, implementable on a physical device"---this is a question of practical measure. The Church-Turing Thesis, which is still widely believed to be correct, answers the question with a simple "no."
The Church-Turing Thesis: Every algorithm that is implementable on a physical device can be represented on a Turing machine.
Indeed, every reasonable model of computation that has been discovered in the past century that is at least as powerful as the Turing machine has been proven to be equivalent to the Turing machine, including \(\lambda\)-terms.

(Church-Turing) Let \(f \colon S_1 \to S_2\) be any function between sets \(S_1, S_2\). Then \(f\) is \(\lambda\)-representable if and only if it is Turing computable.