The Halting Problem
Let's start with a practical scenario.
You work for some big tech company, [redacted], that offers a service that allows users to upload and run their own software in the cloud.
The way this works is that after a user uploads some code to the server, they can edit it, compile it, and let their own clients run it.
When a client requests to run a user's program, that program gets put on a queue somewhere and waits for its turn to complete whatever task it has been assigned.
This is a wonderfully successful service, and something like 60% of the internet makes its way onto the servers!
This is great and all, but it turns out that the average user is not a great programmer: their programs are often buggy, and the worst are the ones that loop forever, i.e., they never halt.
This is bad because it effectively means that nothing else on the queue after it gets to run!
Your boss, [redacted], isn't pleased.
So he asks you, his best programmer, to fix it.
One solution is of course to only let each program run for a certain amount of time.
There are several issues with this, but suffice it to say that certain very important bits of software need to be allowed more time than others to get their tasks completed, and it's hard to know in advance how much time exactly that is.
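The time-limit idea can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the scenario: each job is modelled as a generator that yields once per step, and the names `run_queue`, `countdown`, and `spin` are made up for this example. It shows the weakness described above: a fixed budget stops the looping job, but it also kills a slow job that would eventually have finished.

```python
def countdown(n):
    """A job that halts after n steps (hypothetical example job)."""
    while n > 0:
        n -= 1
        yield


def spin():
    """A job that never halts (hypothetical example job)."""
    while True:
        yield


def run_queue(jobs, budget):
    """Give every job at most `budget` steps, in queue order."""
    results = {}
    for name, job in jobs:
        for _ in range(budget):
            try:
                next(job)  # run one step of the job
            except StopIteration:
                results[name] = "finished"
                break
        else:
            # The budget ran out before the job finished.
            results[name] = "killed"
    return results


jobs = [("quick", countdown(3)), ("slow", countdown(1000)), ("stuck", spin())]
report = run_queue(jobs, budget=100)
```

Here `slow` and `stuck` receive the same verdict, even though only one of them is actually broken.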
So your boss, [redacted], wants you to be smarter about it (the irony is not lost on you), and write some program that takes a block of code as input and decides whether or not that block of code compiles to a program that halts.
(The Halting Problem)
The Halting problem \(L_{Halt}\) is the language of all string representations of Turing machine programs that halt on a given input.
Formally,
\[
L_{Halt} = \left\{ \lfloor \mathcal T \rfloor \mathtt{*} x \mathtt{*} w ~\Big|~ \begin{gathered}
\text{\(\mathcal T\) is a Turing machine with state \(x\), }\\
\text{\(w \in A^*\), and \(x\) halts on input \(w\)}\\
\end{gathered}\right\}
\]
is the set of all strings \(\lfloor \mathcal T\rfloor\mathtt{*x*}w\) such that \(w \in A^*\), \(\mathtt{x}\) is the string representation of a state \(x\) of \(\mathcal T\), and \(x\) halts on input \(w\).
(Understanding Halting)
Consider the encoding of a one-state Turing machine below,
\[
\lfloor \mathcal T\rfloor = \mathtt{state\#s1{/}if\#0{:}move\#right{.}write\#0{.}goto\#s1}
\]
- Draw the Turing machine \(\mathcal T\).
- Is \(\lfloor\mathcal T\rfloor \mathtt{*s1{*}0} \in L_{Halt} \)?
- Is \(\lfloor\mathcal T\rfloor \mathtt{*s1{*}1} \in L_{Halt} \)?
- Is \(\lfloor\mathcal T\rfloor \mathtt{*s1{*}} \in L_{Halt} \)?
- Is \(\lfloor\mathcal T\rfloor \mathtt{*s2{*}0} \in L_{Halt} \)?
Effectively, what [redacted] is asking you to do is to design a decision procedure for \(L_{Halt}\).
Unfortunately for your boss, this task is not possible.
Undecidability of the Halting Problem
Recall that we write \(\mathsf{Dec}\) for the family of all languages \(L \subseteq A^*\) such that there is a Turing machine \(\mathcal T\) with a state \(x\) such that \(x\) decides \(L\), i.e.,
\[
L = \{w \in A^* \mid \mathcal T_x(w) = 1\}
\quad \text{and} \quad
A^*\setminus L = \{w \in A^* \mid \mathcal T_x(w) = 0\}
\]
The main result of this section is that \(L_{Halt} \notin \mathsf{Dec}\).
In other words, the halting problem is undecidable.
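To make "decides" concrete, here is a minimal Python sketch of a decider for a much easier language. The language (words over \(\{0,1\}\) with an even number of 1s) and the function name `decide_even_ones` are chosen purely for illustration and do not come from the text; the Python function stands in for \(\mathcal T_x\). The point is that a decider returns an answer on every input: \(1\) for words in the language and \(0\) for words outside it.

```python
def decide_even_ones(w):
    """Return 1 if w has an even number of 1s, and 0 otherwise.

    Assumes w is a string over the alphabet {0, 1}.
    The function always returns, so it plays the role of a decider.
    """
    return 1 if w.count("1") % 2 == 0 else 0


answers = [decide_even_ones(w) for w in ["", "11", "1011"]]
```

A decider for \(L_{Halt}\) would have to behave in the same way, returning \(1\) or \(0\) on every input word without ever running forever.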
(Halting is Undecidable)
The Halting problem is undecidable, i.e., \(L_{Halt} \notin \mathsf{Dec}\).
The proof is by contradiction: we are going to begin by assuming that \(L_{Halt}\) is decidable, and from that assumption conclude something absurd, like \(1 = 0\).
If there is any proof in this class that you should pay attention to, it is this one!
Assume for a contradiction that \(L_{Halt} \in \mathsf{Dec}\).
Then there is a Turing machine \(\mathcal H\) with a state \(h\) such that \(\mathcal H_{h}(w) = 1\) if \(w \in L_{Halt}\) and \(\mathcal H_{h}(w) = 0\) if \(w \notin L_{Halt}\).
We are going to use \(h\) as a subroutine in a different program that exhibits an impossible behaviour!
Let \(\mathtt{double\_input}\) be a Turing machine program that doubles the input string on the tape, i.e., turns the tape containing \(w\) into the tape containing \(ww\).
This can be implemented in one of several ways.
(Implement a Doubler)
Design an explicit Turing machine \(\mathcal T\) with a state \(\mathtt{double\_input}\) such that \(\mathcal T_{\mathtt{double\_input}}(w) = ww\) for any word \(w \in \{0,1\}^*\).
Let us write \(\mathtt{det\_if\_halt}\) for the Turing program \(h\) in \(\mathcal H\), to make our construction below clearer.
Now consider the following Turing machine, which we will call \(\mathcal D\).
| diag = | double_input.det_if_halt.goto liar |
| liar = | if 0 : halt |
|        | if 1 : goto loop |
| loop = | move right.move left.goto loop |
In the code for \(\mathcal D\) above, \(\mathtt{det\_if\_halt}\) decides whether the input string is of the form \(\lfloor \mathcal T\rfloor \mathtt{* x *} w\) for some Turing machine \(\mathcal T\) with state \(x\) that halts on input \(w\): it outputs \(1\) if so and \(0\) otherwise, and \(\mathtt{liar}\) then branches on that output.
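The shape of \(\mathcal D\) can be mimicked in Python. This is a loose sketch rather than a Turing machine: programs are modelled as generators that yield once per step, `make_diag` and `halts_within` are names invented for this example, and the string `W` is only a placeholder for a real encoding. Since no true `det_if_halt` exists, the sketch plugs in two obviously wrong candidates to show how `diag` does the opposite of whatever the candidate predicts.

```python
def make_diag(det_if_halt):
    """Build the program `diag` from a claimed halting decider.

    `det_if_halt` takes a word and returns 1 ("halts") or 0 ("does not halt").
    """
    def diag(w):
        verdict = det_if_halt(w + w)  # double_input, then det_if_halt
        yield                         # count this as one step of computation
        if verdict == 0:              # liar: if 0 : halt
            return
        while True:                   # liar: if 1 : goto loop
            yield
    return diag


def halts_within(program, w, fuel):
    """Run `program` on `w` for at most `fuel` steps; True if it finished."""
    steps = program(w)
    for _ in range(fuel):
        try:
            next(steps)
        except StopIteration:
            return True
    return False


# Hypothetical stand-in for the word [D]*diag*; a real encoding is not needed here.
W = "<D>*diag*"

# Two (wrong) candidate deciders: one always answers 0, the other always answers 1.
says_never_halts = make_diag(lambda word: 0)
says_always_halts = make_diag(lambda word: 1)
```

With the candidate that always answers \(0\), `diag` halts after a couple of steps, so the answer \(0\) was wrong. With the candidate that always answers \(1\), `diag` enters its loop and never finishes, so the answer \(1\) was wrong too. Note that `halts_within` only observes a bounded number of steps; it is a testing aid, not a halting decider.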
Now consider the input word
\[
w = \lfloor \mathcal D\rfloor \mathtt{* {diag} *}
\]
Does \(\mathtt{diag}\) halt on input \(w\)?
Let's check!
We can split this into two cases.
Case 1:
If \(\mathtt{diag}\) halts on input \(w\), then the output of running \(\mathtt{double\_input{.}det\_if\_halt}\) on the input word \(w\) must be \(0\), or else the \(\mathtt{liar}\) state enters the diverging \(\mathtt{loop}\).
Therefore, we must have
\[
0
= \mathcal H_{h}(ww)
= \mathcal H_{h}(
{\color{blue}\lfloor \mathcal D \rfloor} \mathtt{* {\color{red} {diag}} *} {\color{purple}\lfloor \mathcal D \rfloor \mathtt{* {diag} *}}
)
\]
But this word is a representation of a Turing machine (blue), state (red), and input (purple)!
Namely,
\[
{\color{blue}\lfloor \mathcal D \rfloor} \mathtt{* {\color{red} {diag}} *} {\color{purple}\lfloor \mathcal D \rfloor \mathtt{* {diag} *}}
= {\color{blue}\lfloor \mathcal D \rfloor} \mathtt{* {\color{red} {diag}} *} {\color{purple} w }
\]
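The way the doubled word splits into machine, state, and input can be checked mechanically. The Python sketch below assumes, as the encoding of the earlier one-state example suggests, that neither the machine encoding nor the state name contains the separator `*`; the function name `parse_instance` and the placeholder string `"<D>"` are made up for this illustration.

```python
def parse_instance(word):
    """Split a word of the form <machine>*<state>*<input> at its first two '*'.

    Assumption: the machine encoding and the state name contain no '*'.
    Returns None if the word has fewer than two '*' separators.
    """
    parts = word.split("*", 2)
    if len(parts) != 3:
        return None
    machine, state, tape_input = parts
    return machine, state, tape_input


D = "<D>"            # hypothetical placeholder for the encoding of D
w = D + "*diag*"     # the input word from the proof
machine, state, tape_input = parse_instance(w + w)
```

Splitting at the first two separators leaves the whole second copy of \(w\), including its own `*` symbols, as the input part.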
By definition of \(\mathtt{det\_if\_halt}\), \(\mathcal H_{h}({\color{blue}\lfloor \mathcal D \rfloor} \mathtt{* {\color{red} {diag}} *} {\color{purple} w }) = 0\) implies that \(\mathtt{diag}\) never halts on \({\color{purple} w } = {\color{purple}\lfloor \mathcal D \rfloor \mathtt{* {diag} *}}\).
This contradicts the assumption at the beginning of this case.
(Needing Duplication)
Try this part of the proof again, but without running \(\mathtt{double\_input}\) before \(\mathtt{det\_if\_halt}\).
What goes wrong?
Case 2:
If \(\mathtt{diag}\) does not halt on input \(\lfloor \mathcal D \rfloor \mathtt{* {diag} *}\), then the output of running \(\mathtt{double\_input{.}det\_if\_halt}\) on the input word \(w\) must be \(1\).
This is because both \(\mathtt{double\_input}\) and \(\mathtt{det\_if\_halt}\) halt on all inputs, so the only path down which \(\mathtt{diag}\) loops forever is the one where the \(\mathtt{liar}\) state enters the diverging \(\mathtt{loop}\).
In this case, we have
\[
1
= \mathcal H_{h}(ww)
= \mathcal H_{h}(
{\color{blue}\lfloor \mathcal D \rfloor} \mathtt{* {\color{red} {diag}} *} {\color{purple}\lfloor \mathcal D \rfloor \mathtt{* {diag} *}}
)
\]
Again, this word is a representation of a Turing machine (blue), state (red), and input (purple),
\[
{\color{blue}\lfloor \mathcal D \rfloor} \mathtt{* {\color{red} {diag}} *} {\color{purple}\lfloor \mathcal D \rfloor \mathtt{* {diag} *}} = {\color{blue}\lfloor \mathcal D \rfloor} \mathtt{* {\color{red} {diag}} *} {\color{purple} w }
\]
By definition of \(\mathtt{det\_if\_halt}\), \(\mathcal H_{h}({\color{blue}\lfloor \mathcal D \rfloor} \mathtt{* {\color{red} {diag}} *} {\color{purple} w }) = 1\) implies that \(\mathtt{diag}\) halts on input \({\color{purple} w } = {\color{purple}\lfloor \mathcal D \rfloor \mathtt{* {diag} *}}\).
This contradicts the assumption at the beginning of this case.
...wait, but then \(\mathtt{diag}\) can neither halt nor fail to halt on input \(w\): each case ends in a contradiction, yet one of the two must hold!
Since this is not possible, our initial assumption that \(\mathcal H\) and \(h\) (i.e., \(\mathtt{det\_if\_halt}\)) exist must have been false.
Therefore, \(L_{Halt}\) is not decidable.
Formally, \(L_{Halt} \notin \mathsf{Dec}\).
(Arguing)
With your team, comb through the proof above and make sure everybody understands each step.