CSCI 341 Theory of Computation

Fall 2025, with Schmid

← 3.8 Recognizability And Enumerability 4.1 Polynomial Reductions →

Timing Turing Machines

In the previous part of the course, we discovered the Turing machine, a powerful model of computation that mimics the bit-writing-and-rewriting operations of modern hardware. It was a mathematical model of computation, which allowed us to give an "upper-bound" on the power of computing as a whole: it showed us that there are computational problems that cannot be solved algorithmically.

But what about the problems that can be solved algorithmically? We are now ready for Part 4 of this story, which has less to do with whether or not a given computational problem can be solved at all, and more to do with how many resources it takes to solve the problem. This is what's known as the complexity of the problem. Studying complexity with Turing machines precisely requires precise definitions of time and space as resources.

Runtime

Our model for the memory of a computer was the tape machine, which we defined to be a pair \((t, i)\) consisting of a tape \(t \colon \mathbb Z \to A \cup \{\_\}\) and a position \(i \in \mathbb Z\). Tape machines came with a notion of tape program, and we defined these with a grammar that allowed us to write sequences of the basic commands \(\mathtt{move~right}\), \(\mathtt{move~left}\), and \(\mathtt{write~\sigma}\), where \(\sigma \in A\). Now, it is a bit hard to say how much time each of these basic commands takes to run, given they are abstract entities. In particular, we don't know the units of time to measure with. Our fundamental assumption going forward is that each of these basic tape machine commands takes about the same amount of time to run. This allows us to use the amount of time that a basic tape machine command to run as our unit of time. This reduces measuring the time taken by a particular tape program to simply counting the number of basic tape machine commands get run.

(Tape Program Runtime) Let \(p \in \mathtt{Tape}\) be a tape machine program. Then the runtime of \(p\), written \(\mathsf{time}(p)\), is the number of basic tape machine commands that appear in \(p\). More formally,

\(\mathsf{time}(\mathtt{skip}) = 0\)
\(\mathsf{time}(\mathtt{write~\sigma}) = \mathsf{time}(\mathtt{move~left}) = \mathsf{time}(\mathtt{move~right}) = 1\)
If \(p,q \in \mathtt{Tape}\), then \(\mathsf{time}(p{.}q) = \mathsf{time}(p) + \mathsf{time}(q)\)

The definition of running time for tape programs extends pretty readily to timing Turing machine programs, with the minute exception that we now have to deal with the possibility that a particular Turing machine program does not halt.

(Turing Program Runtime) Let \(\mathcal T = (Q, A, \delta)\) be a Turing machine. The runtime of a given a path through the Turing machine \[ x_0 \xrightarrow{a_1 \mid p_1} x_1 \xrightarrow{a_2 \mid p_2} \cdots \xrightarrow{a_n \mid p_n} x_n \qquad\qquad (*) \] is the nonnegative integer \(\mathsf{time}(p_1{.}p_2\dots p_n)\). If the path (*) is a halting run in \(\mathcal T\) starting from \(x\) on an input word \(w\), then we say that the runtime of (*) is completed. If it is a run but is not halting, then we say that it is partial.

If we additionally assume that \(\mathcal T\) is deterministic and \(x\) halts on input \(w\), then we define \[ \mathsf{time}(\mathcal T, x)(w) = \mathsf{time}(p_1{.}p_2 \dots p_n) \] if (*) is a halting run, and say that \(\mathsf{time}(\mathcal T, x)(w)\) is the runtime of \(x\) on input \(w\).

Later, we will define the runtime of a Turing program where the Turing machine is potentially nondeterministic. But for now, we assume determinism.

For the sake of a snappier notation, we will often write \(\mathsf{time}(x)(w)\) instead of \(\mathsf{time}(\mathcal T, x)(w)\) if \(\mathcal T\) is understood from context.

(Counting Steps) Determine the runtimes of the following Turing program, \(x_{clr}\) in the figure below, on each input.

\(\mathsf{time}(x_{clr})(\varepsilon)\)
\(\mathsf{time}(x_{clr})(0)\)
\(\mathsf{time}(x_{clr})(010)\)

Now, usually when we are measuring the runtime of an algorithm, we are concerned with one of three measures of runtime complexity: worst-case, best-case, and average-case (there is also amortized analysis, but this is a more complex issue for Turing machines) runtimes on inputs of a given size. Measuring size-of-input is straightforward for Turing programs; it's the length of the input word. Best-case and average-case analyses also make sense for Turing programs, but we are going to focus on worst-case analysis for this course for the sake of simply illustrating how analyzing Turing programs is generally done. The worst-case runtime of a Turing program is simply the maximum runtime over all inputs of a given size.

(Worst-case Runtime) Let \(\mathcal T = (Q, A, \delta)\) be a Turing machine, let \(x \in Q\), and let \(n \in \mathbb N\). The worst-case runtime of \(x\) on inputs of length \(n\), if it exists, is the number \[ \mathsf{maxtime}_x(n) = \max \{ \mathsf{time}(x)(w) \mid w \in A^* \text{ where } \mathsf{len}(w) \le n\} \] if it exists, and we write \(\mathsf{maxrt}_x(n) = \infty\) if \(x\) runs forever on some input \(w\) with \(\mathsf{len}(w) \le n\).

It is not uncommon to restrict the possible inputs to a particular set of strings when measuring complexity. This comes up when discussing the "difficulty" (complexity) of a problem relative to its representation.

(Relative Worst-case Runtime) Let \(\mathcal T = (Q, A, \delta)\) be a Turing machine, let \(x \in Q\), and let \(n \in \mathbb N\). Let \(I \subseteq A^*\) be a language of possible inputs. The relative worst-case runtime of \(x\) on inputs of length \(n\) in \(I\), if it exists, is the number \[ \mathsf{maxtime}_x^I(n) = \max \{ \mathsf{time}(x)(w) \mid w \in R \text{ where } \mathsf{len}(w) \le n\} \] and we write \(\mathsf{maxtime}_x^I(n) = \infty\) if \(x\) runs forever on some input \(w \in I\) with \(\mathsf{len}(w) \le n\).

(Counting worst-case Steps) Determine the worst-case runtimes of the following Turing program, \(x_{bad}\) in the figure below (it's not the same as the one above!).

\(\mathsf{maxtime}_{x_{bad}}(0)\)
\(\mathsf{maxtime}_{x_{bad}}(2)\)
\(\mathsf{maxtime}_{x_{bad}}(4)\)

Now consider the language of possible inputs \(I = \mathcal L(1^*)\). Determine the relative worst-case runtimes of \(x_{bad}\) on inputs in \(I\).

\(\mathsf{maxtime}_{x_{bad}}^I(0)\)
\(\mathsf{maxtime}_{x_{bad}}^I(2)\)
\(\mathsf{maxtime}_{x_{bad}}^I(4)\)

The best-case runtime is defined similarly to the worst-case runtime, except with a \(\min\) taken instead of a \(\max\). What are the best-case runtimes on the inputs above?

If \(x\) halts on all inputs, then the "\(\cup \{\infty\}\)" is unnecessary in the definition above.

Runtime Complexity

The worst-case runtime of a Turing program \(x\), as long as it halts on all inputs, is therefore a function on the natural numbers, \[ \mathsf{maxtime}_x \colon \mathbb N \to \mathbb N \] This allows us to apply tools for asymptotic comparisons between worst-case runtimes, which you have already seen in past courses. We include the relevant definitions below anyhow, for reference.

(Asymptotic Comparison) Let \(f, g \colon \mathbb N \to \mathbb N\) be any two functions from the natural numbers to the natural numbers.

We write \(f \in \mathcal O(g)\) and say that \(f\) \(g\) grows as fast as \(f\) if there is a \(c > 0\) such that for all \(n \in \mathbb N\), \(f(n) < c ~ g(n)\).
We write \(g \in \Omega(f)\) if \(f \in \mathcal O(g)\)
We write \(f \approx g\) and say that \(f\) and \(g\) grow at the same rate if \(f \in \mathcal O(g)\) and \(f \in \Omega(g)\)

Strict comparisons are also useful at times.

We write \(f \in o(g)\) and say that \(f\) \(g\) grows strictly faster than \(f\) if \(f \in \mathcal O(g)\) and \(f \not\approx g\).
We write \(g \in \omega(f)\) if \(f \in o(g)\)

Typical asymptotic comparisons include things like \(f \in \mathcal O(n^2)\) (parabolic growth), \(f \in \mathcal O(n^3)\) (cubic growth), and \(f \in \mathcal O(b^n)\) for some \(b > 1\) (exponential growth). One of the foundational observations made in the theory of computational growth is that the latter growth should be held distinct from the previous two.

(Polynomial and Exponential Time) Let \(f \colon \mathbb N \to \mathbb N\). We say that

\(f\) exhibits polynomial growth if there exists a \(k \in \mathbb N\) such that \(f \in \mathcal O(n^k)\)
\(f\) exhibits \(b\)-exponential growth if for the given \(b \in \mathbb N\), \(f \in \mathcal O(b^n)\)

Given a Turing machine \(\mathcal T = (Q, A, \delta)\) with a state \(x \in Q\), and given \(I \subseteq A^*\), we say that

where \(f \colon \mathbb N\to \mathbb N\), \(x\) runs in \(\mathcal O(f)\)-time (relative to \(I\)) if \(\mathsf{maxtime}_x \in \mathcal O(f)\)
\(x\) runs in polynomial time (relative to \(I\)) if \(\mathsf{maxtime}_x^I\) exhibits polynomial growth
\(x\) runs in exponential time (relative to \(I\)) if \(\mathsf{maxtime}_x^I\) exhibits \(b\)-exponential growth for some \(b > 1\)

(Polynomial Effects) Let \(p(x) = c_0 + c_1 x + c_2 x^2\) be a quadratic polynomial.

Let \(f \in \mathcal O(n^3)\). Check that \(f \circ p \in \mathcal O(n^6)\).
Extrapolating from the previous exercise, if \(q(x)\) is any polynomial at all and \(f\) exhibits polynomial growth, does \(p \circ f\) exhibit polynomial growth?
Use Desmos graphing calculator to graph \(2^{n^2}\) and \(3^n\) as functions of \(n\). Is \(f \circ p \in \mathcal O(3^n)\)? Up to observation, does \(f \circ p\) exhibit exponential growth?

Asymptotic growth classes allow us to organize the space of all computational problems by the growth rates of the runtimes of their solutions. One can see this as a measure of their complexity, or simply said, how difficult they are to solve. Today we are going to look at two typical runtime complexities.

Tractability

We start with a definition of "tractable", which intuitively means "can be solved in a reasonable amount of time".

(Tractibility)

Given a decision problem \(D \subseteq S\) for a set \(S\), we say that \(D\) is tractable if it has a faithful representation \((\rho, L)\) such that \(L\) has a decision procedure that runs in polynomial time (relative to \(\rho(D) = \{w \in A^* \mid \exists d \in D, w = \rho(d)\}\)).

Let \(f \colon S_1 \to S_2\) be a function between sets \(S_1\) and \(S_2\). The problem of computing \(f\) is tractable if there is a computable representation \((\rho_1, \mathcal T_x, \rho_2)\) of \(f\), where the state \(x\) of the Turing machine \(\mathcal T\) runs in polynomial time (relative to \(\rho_1(S_1)\)).

The family of tractable problems is written \(\mathsf{PTIME}\), or \(\mathsf{P}\) for short, where the P stands for "polynomial".

("I could do that") Show that the following problems are tractable.

\(f \colon \{0,1\}^* \to \{0,1\}^*\) where \(f(\varepsilon) = \varepsilon\), \(f(0w) = 1~f(w)\), and \(f(1w) = 0~f(w)\) (\(f\) computes the \(2\)'s complement of bitstrings). Note that in this case, you don't need a representation.
\(f \colon \{0,1\}^* \to \{0,1\}^*\) where \(f(w) = 0^n1^m\), where \(n\) is the number of \(0\)s in \(w\) and \(m\) is the number of \(1\)s in \(w\)
\(D \subseteq \mathbb N\) where \(D = \{2n \mid n \in \mathbb N\}\)

(Composing Tractable Problems) Let \(f \colon S_1 \to S_2\) and \(g \colon S_2 \to S_3\) be functions between sets \(S_1,S_2,S_3\). Prove that if the problems of computing \(f\) and \(g\) are both tractible, then the problem of computing \(g \circ f\) is tractible.

Spacial Complexity

We should make a brief note here about spacial complexity, since it plays a big role in more advanced ideas that you are bound to encounter in your career at some point.

(Spacial Requirements) Let \(\mathcal T = (Q, A, \delta)\) be a Turing machine with a state \(x \in Q\). Let \(w = a_0 \cdots a_{n-1} \in A^*\) be a word, and consider the tape \((t, 0)\) with \(t(i) = a_i\) for \(0 \le i \le n-1\). The space required to run \(x\) on input \(w\) is \(N_{right} - N_{left}\), where

\(N_{right} =\ \) the maximum index \(j\) such that the tape head reaches \((t', j)\) while it runs \(x\) on input \(w\)
\(N_{left} =\ \) the minimum index \(j\) such that the tape head reaches \((t', j)\) while it runs \(x\) on input \(w\)

We write \[ \mathsf{space}(\mathcal T, x)(w) = N_{right} - N_{left} \]

Spacial requirement is another measure of the amount of resources required by a program as it runs that is worth keeping track of. We can similarly define the asymptotic space requirements of a program, as well as the spacial complexity of a program as it runs.

(Spacial Complexity) Let \(\mathcal T = (Q, A, \delta)\) be a Turing machine with a state \(x \in Q\), and let \(n \in \mathbb N\) and \(I \subseteq A^*\). The (relative) worst-case space requirements of \(x\) on inputs of length \(n\) (in \(I\)) is defined to be \[ \mathsf{maxspace}_x^I(n) = \max \{\mathsf{space}(x)(w) \mid w \in I \text{ and } \mathsf{len}(w) \le n \text{ and } w \in I\} \] We say that \(x\)

runs in \(O(f)\)-space (relative to \(I\)) if \(\mathsf{maxspace}_x^I \in \mathcal O(f)\).
runs in polynomial space (relative to \(I\)) if \(\mathsf{maxspace}_x^I \in \mathcal O(n^k)\) for some \(k \in \mathbb N\).
runs in exponential space (relative to \(I\)) if \(\mathsf{maxspace}_x^I \in \mathcal O(b^n)\) for some \(b > 1\).

(Be Your Own Boss) Formulate a sensible definition of "polynomial space problem" and "exponential space problem", both for functions and for decision problems.

← 3.8 Recognizability And Enumerability 4.1 Polynomial Reductions →

Top