Timing Turing Machines
In the previous part of the course, we discovered the Turing machine, a powerful model of computation that mimics the bit-writing-and-rewriting operations of modern hardware. Because it is a precise mathematical model, it allowed us to put an "upper bound" on the power of computing as a whole: it showed us that there are computational problems that cannot be solved algorithmically.
But what about the problems that can be solved algorithmically? We are now ready for Part 4 of this story, which has less to do with whether a given computational problem can be solved at all, and more to do with how many resources it takes to solve it. This is what's known as the complexity of the problem. Studying complexity with Turing machines requires precise definitions of time and space as resources.
Runtime
Our model for the memory of a computer was the tape machine, which we defined to be a pair \((t, i)\) consisting of a tape \(t \colon \mathbb Z \to A \cup \{\_\}\) and a position \(i \in \mathbb Z\). Tape machines came with a notion of tape program, and we defined these with a grammar that allowed us to write sequences of the basic commands \(\mathtt{move~right}\), \(\mathtt{move~left}\), and \(\mathtt{write~\sigma}\), where \(\sigma \in A\). Now, it is a bit hard to say how much time each of these basic commands takes to run, given that they are abstract entities; in particular, we don't know what units of time to measure with. Our fundamental assumption going forward is that each of these basic tape machine commands takes about the same amount of time to run. This allows us to use the time taken by a single basic tape machine command as our unit of time, which reduces measuring the time taken by a particular tape program to simply counting the number of basic tape machine commands that get run.
- \(\mathsf{time}(\mathtt{skip}) = 0\)
- \(\mathsf{time}(\mathtt{write~\sigma}) = \mathsf{time}(\mathtt{move~left}) = \mathsf{time}(\mathtt{move~right}) = 1\)
- If \(p,q \in \mathtt{Tape}\), then \(\mathsf{time}(p{.}q) = \mathsf{time}(p) + \mathsf{time}(q)\)
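To make the recursion in this definition concrete, here is a minimal Python sketch. The encoding of tape programs as nested lists, and the command names `"skip"`, `"left"`, `"right"`, and `("write", sigma)`, are our own illustration, not notation from the course:

```python
def time(program):
    """Count the basic commands a tape program runs.

    A program is the atom "skip", a basic command ("left", "right",
    or a ("write", sigma) pair), or a list [p, q, ...] standing for
    the sequence p.q. ... of subprograms.
    """
    if program == "skip":
        return 0                              # time(skip) = 0
    if isinstance(program, list):
        return sum(time(p) for p in program)  # time(p.q) = time(p) + time(q)
    return 1                                  # each write/move command costs 1

# Writing a 1, moving right, and writing a 0 takes 3 units of time.
print(time([("write", "1"), "right", ("write", "0")]))  # 3
```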
The definition of running time for tape programs extends pretty readily to timing Turing machine programs, with the minute exception that we now have to deal with the possibility that a particular Turing machine program does not halt.
Later, we will define the runtime of a Turing program where the Turing machine is potentially nondeterministic. But for now, we assume determinism.
For the sake of a snappier notation, we will often write \(\mathsf{time}(x)(w)\) instead of \(\mathsf{time}(\mathcal T, x)(w)\) if \(\mathcal T\) is understood from context.
- \(\mathsf{time}(x_{clr})(\varepsilon)\)
- \(\mathsf{time}(x_{clr})(0)\)
- \(\mathsf{time}(x_{clr})(010)\)
Now, usually when we are measuring the runtime of an algorithm, we are concerned with one of three measures of runtime complexity on inputs of a given size: worst-case, best-case, and average-case runtimes (there is also amortized analysis, but this is a more complex issue for Turing machines). Measuring the size of an input is straightforward for Turing programs: it's the length of the input word. Best-case and average-case analyses also make sense for Turing programs, but for this course we will focus on worst-case analysis, for the sake of simply illustrating how analyzing Turing programs is generally done. The worst-case runtime of a Turing program is simply the maximum runtime over all inputs of a given size.
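Since the worst-case runtime is a maximum over the finitely many words of a given length, it can, for small sizes, be computed by brute force. Here is a sketch in Python; the `run_time` argument is a stand-in for \(\mathsf{time}(x)\) and an assumption of this sketch, not a function defined in the notes:

```python
from itertools import product

def maxtime(run_time, alphabet, n):
    """Worst-case runtime on inputs of size n: the maximum of
    run_time(w) over all words w of length n over the alphabet."""
    return max(run_time("".join(w)) for w in product(alphabet, repeat=n))

# Toy stand-in: a "program" costing 2 steps per cell plus 1 per '1' read.
toy = lambda w: 2 * len(w) + w.count("1")
print(maxtime(toy, "01", 3))  # 9, achieved on the worst-case input w = "111"
```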
It is not uncommon to restrict the possible inputs to a particular set of strings when measuring complexity. This comes up when discussing the "difficulty" (complexity) of a problem relative to its representation.
- \(\mathsf{maxtime}_{x_{bad}}(0)\)
- \(\mathsf{maxtime}_{x_{bad}}(2)\)
- \(\mathsf{maxtime}_{x_{bad}}(4)\)
- \(\mathsf{maxtime}_{x_{bad}}^I(0)\)
- \(\mathsf{maxtime}_{x_{bad}}^I(2)\)
- \(\mathsf{maxtime}_{x_{bad}}^I(4)\)
If \(x\) halts on all inputs, then the "\(\cup \{\infty\}\)" is unnecessary in the definition above.
Runtime Complexity
The worst-case runtime of a Turing program \(x\), as long as it halts on all inputs, is therefore a function on the natural numbers, \[ \mathsf{maxtime}_x \colon \mathbb N \to \mathbb N \] This allows us to apply tools for asymptotic comparisons between worst-case runtimes, which you have already seen in past courses. We include the relevant definitions below anyhow, for reference.
- We write \(f \in \mathcal O(g)\) and say that \(g\) grows at least as fast as \(f\) if there are \(c > 0\) and \(N \in \mathbb N\) such that \(f(n) \leq c ~ g(n)\) for all \(n \geq N\).
- We write \(g \in \Omega(f)\) if \(f \in \mathcal O(g)\)
- We write \(f \approx g\) and say that \(f\) and \(g\) grow at the same rate if \(f \in \mathcal O(g)\) and \(f \in \Omega(g)\)
- We write \(f \in o(g)\) and say that \(g\) grows strictly faster than \(f\) if \(f \in \mathcal O(g)\) and \(f \not\approx g\).
- We write \(g \in \omega(f)\) if \(f \in o(g)\)
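As a quick worked example of exhibiting a witnessing constant for a \(\mathcal O\)-comparison (the specific function and constant here are our own illustration):

```latex
% Worked example: 3n^2 + 5n \in \mathcal O(n^2).
% For all n \ge 1 we have 5n \le 5n^2, hence
%   3n^2 + 5n \le 3n^2 + 5n^2 = 8n^2,
% so the constant c = 8 witnesses the comparison.
\[
  3n^2 + 5n \;\le\; 8\,n^2 \quad \text{for all } n \ge 1,
  \qquad \text{and therefore} \qquad
  3n^2 + 5n \in \mathcal O(n^2).
\]
```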
Typical asymptotic comparisons include things like \(f \in \mathcal O(n^2)\) (quadratic growth), \(f \in \mathcal O(n^3)\) (cubic growth), and \(f \in \mathcal O(b^n)\) for some \(b > 1\) (exponential growth). One of the foundational observations made in the theory of computational complexity is that the last of these should be held distinct from the first two.
- \(f\) exhibits polynomial growth if there exists a \(k \in \mathbb N\) such that \(f \in \mathcal O(n^k)\)
- \(f\) exhibits \(b\)-exponential growth, for a given \(b \in \mathbb N\), if \(f \in \mathcal O(b^n)\)
- where \(f \colon \mathbb N\to \mathbb N\), \(x\) runs in \(\mathcal O(f)\)-time (relative to \(I\)) if \(\mathsf{maxtime}_x^I \in \mathcal O(f)\)
- \(x\) runs in polynomial time (relative to \(I\)) if \(\mathsf{maxtime}_x^I\) exhibits polynomial growth
- \(x\) runs in exponential time (relative to \(I\)) if \(\mathsf{maxtime}_x^I\) exhibits \(b\)-exponential growth for some \(b > 1\)
- Let \(f \in \mathcal O(n^3)\). Check that \(f \circ p \in \mathcal O(n^6)\).
- Extrapolating from the previous exercise, if \(p(x)\) is any polynomial at all and \(f\) exhibits polynomial growth, does \(p \circ f\) exhibit polynomial growth?
- Use the Desmos graphing calculator to graph \(2^{n^2}\) and \(3^n\) as functions of \(n\). Is \(f \circ p \in \mathcal O(3^n)\)? Up to observation, does \(f \circ p\) exhibit exponential growth?
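As a quick numeric companion to the graphing exercise above, one can also tabulate both functions in Python (the numbers only illustrate the trend; they do not prove any asymptotic claim):

```python
def table(n_max):
    """Tabulate (n, 2^(n^2), 3^n) for n = 1, ..., n_max."""
    return [(n, 2 ** (n * n), 3 ** n) for n in range(1, n_max + 1)]

for n, a, b in table(7):
    print(f"n={n}: 2^(n^2)={a}, 3^n={b}")
```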
Asymptotic growth classes allow us to organize the space of all computational problems by the growth rates of the runtimes of their solutions. One can see this as a measure of their complexity, or, put simply, how difficult they are to solve. Today we are going to look at two typical runtime complexities.
Tractability
We start with a definition of "tractable", which intuitively means "can be solved in a reasonable amount of time".
- \(f \colon \{0,1\}^* \to \{0,1\}^*\) where \(f(\varepsilon) = \varepsilon\), \(f(0w) = 1~f(w)\), and \(f(1w) = 0~f(w)\) (\(f\) computes the bitwise complement, i.e. the ones' complement, of bitstrings). Note that in this case, you don't need a representation.
- \(f \colon \{0,1\}^* \to \{0,1\}^*\) where \(f(w) = 0^n1^m\), where \(n\) is the number of \(0\)s in \(w\) and \(m\) is the number of \(1\)s in \(w\)
- \(D \subseteq \mathbb N\) where \(D = \{2n \mid n \in \mathbb N\}\)
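To make the first two example functions concrete, here are ordinary Python versions of their specifications. These are not tape programs, just reference implementations of the input/output behaviour; a Turing program can compute either in a single linear scan:

```python
def flip(w):
    """f(epsilon) = epsilon, f(0w) = 1 f(w), f(1w) = 0 f(w):
    flip every bit of the input word."""
    return "".join("1" if c == "0" else "0" for c in w)

def sort_bits(w):
    """f(w) = 0^n 1^m, where n and m count the 0s and 1s of w."""
    return "0" * w.count("0") + "1" * w.count("1")

print(flip("010"))        # 101
print(sort_bits("1010"))  # 0011
```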
Space Complexity
We should make a brief note here about space complexity, since it plays a big role in more advanced ideas that you are bound to encounter at some point in your career.
- \(N_{right} =\ \) the maximum index \(j\) such that the tape head reaches \((t', j)\) while it runs \(x\) on input \(w\)
- \(N_{left} =\ \) the minimum index \(j\) such that the tape head reaches \((t', j)\) while it runs \(x\) on input \(w\)
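Assuming the space used by a run is the width of the region of cells the head visits, that is \(N_{right} - N_{left} + 1\) (this formula is our reading of the definitions above), a small Python sketch over a trace of head positions looks like this:

```python
def space_used(trace):
    """Width of the tape region visited during a run, given the
    sequence of head positions the machine occupies."""
    n_right = max(trace)  # maximum index the head reaches
    n_left = min(trace)   # minimum index the head reaches
    return n_right - n_left + 1

# A head walking right to cell 3 and back left to cell -1
# visits cells -1, 0, 1, 2, 3: five cells in total.
print(space_used([0, 1, 2, 3, 2, 1, 0, -1]))  # 5
```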
Space requirement is another measure of the resources a program consumes as it runs, and one worth keeping track of. Just as with time, we can define the asymptotic space requirement of a program, and hence its space complexity.
- runs in \(\mathcal O(f)\)-space (relative to \(I\)) if \(\mathsf{maxspace}_x^I \in \mathcal O(f)\).
- runs in polynomial space (relative to \(I\)) if \(\mathsf{maxspace}_x^I \in \mathcal O(n^k)\) for some \(k \in \mathbb N\).
- runs in exponential space (relative to \(I\)) if \(\mathsf{maxspace}_x^I \in \mathcal O(b^n)\) for some \(b > 1\).