Lecture 7

In-class notes: CS 505 Spring 2025 Lecture 7

Diagonalization

Suppose we are given complexity classes $C_{1}$ and $C_{2}$. How can we show they are different? That is, how can we show $C_{1} \neq C_{2}$?

We’ve seen the technique of diagonalization before when we showed undecidability of certain languages. Diagonalization is a general technique that gives us one way of showing the above result: differentiating between complexity classes. Intuitively, if we are given $L_{1} \in C_{1}$ and $L_{2} \in C_{2}$ , diagonalization allows us to differentiate between $C_{1}$ and $C_{2}$ as follows.

If $M_{1}$ decides $L_{1}$ , then we want to say $M_{1}$ is different from any decider $M_{2}$ for $L_{2}$ .
We do this by arguing for any $x$ , if $M_{2} (x) = 1$ then $M_{1} (x) = 0$ , and vice versa.

Origins of Diagonalization

Diagonalization was originally introduced by Georg Cantor. He used diagonalization to prove that $∣ N ∣ < ∣ R ∣$ . That is, the set of all natural numbers is strictly smaller than the set of all real numbers. This result at the time was not well received: these are both infinite sets, how could you possibly reason about them being different sizes?

This proof first relies on defining when two infinite sets are the same size. Briefly, two infinite sets $S_{1}, S_{2}$ are said to have the same size, written $∣ S_{1} ∣ = ∣ S_{2} ∣$, if there exists a bijection $f$ from $S_{1}$ to $S_{2}$. That is, $f$ maps every $s_{1} \in S_{1}$ to some $f (s_{1}) \in S_{2}$, and for every $s_{2} \in S_{2}$ there is exactly one $s_{1} \in S_{1}$ such that $f (s_{1}) = s_{2}$.

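For example, we know that $∣ N ∣ = ∣ Z ∣$ via the following bijection.

$$0 \mapsto 0, \quad 1 \mapsto 1, \quad 2 \mapsto -1, \quad 3 \mapsto 2, \quad 4 \mapsto -2, \quad \dots$$

$$f (i) = \begin{cases} - \lceil i / 2 \rceil & \text{if } i \text{ is even} \\ \lceil i / 2 \rceil & \text{if } i \text{ is odd} \end{cases}$$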
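As a quick sanity check, the bijection between $N$ and $Z$ can be sketched in a few lines of Python. The function names `nat_to_int` and `int_to_nat` are our own, and checking a finite range is only evidence, not a proof, that the map is a bijection.

```python
def nat_to_int(i):
    """Map a natural number i to an integer: even i go to -ceil(i/2), odd i to ceil(i/2)."""
    half = (i + 1) // 2  # integer form of ceil(i / 2)
    return -half if i % 2 == 0 else half


def int_to_nat(z):
    """Inverse map: positive z comes from the odd number 2z - 1, non-positive z from -2z."""
    return 2 * z - 1 if z > 0 else -2 * z


# The first few values follow the pattern 0, 1, -1, 2, -2, ...
first_values = [nat_to_int(i) for i in range(7)]
```

Having an explicit inverse is what makes the map a bijection: every integer is hit exactly once.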

Under this definition of equal size, Cantor showed that $∣ N ∣ < ∣ R ∣$: there is no bijection between the two sets. We’ll do an easier proof by showing that $N$ is smaller than the interval of real numbers $[0, 1]$.

Theorem. $∣ N ∣ < ∣ [0, 1] ∣$ .

Proof. We do a proof by contradiction. Suppose that $∣ N ∣ = ∣ [0, 1] ∣$ . This means there is a bijection from $N$ to $[0, 1]$ . We can write the bijection as an infinite table.

$N$	$[0, 1]$
$0$	$0.12345\dots$
$1$	$0.14159\dots$
$2$	$0.23333\dots$
$3$	$0.10000\dots$
$4$	$0.67321\dots$

From this table, we’ll construct a new real number $r \in [0, 1]$ that is not in the above bijection. We construct the real number $r = 0. r_{0} r_{1} r_{2} \dots$ digit by digit. We’ll index digits in the right column of the table starting with $0$ (i.e., the first digit after the decimal point is the 0-th digit). Let $f$ denote the bijection described by the table.

Then, for each $i \in N$, we define $r_{i} = d$ for any $d \in \{0, 1, \dots, 9\}$ such that $d \neq f (i)_{i}$, where $f (i)_{i}$ denotes the $i$-th digit of $f (i)$. That is, the $i$-th digit of $r$ is explicitly different from the $i$-th digit of the real number $f (i)$. As a picture, we look at the digits in the table on the diagonal.

$N$	$[0, 1]$
$0$	$0. [[1]] 2345\dots$
$1$	$0.1 [[4]] 159\dots$
$2$	$0.23 [[3]] 33\dots$
$3$	$0.100 [[0]] 0\dots$
$4$	$0.6732 [[1]] \dots$

Taking the positions on the diagonal, we construct the new real number $r = 0.27619 \dots$. Now, there does not exist any $i \in N$ such that $f (i) = r$. This is because for every $i$, we have $r_{i} \neq f (i)_{i}$, which implies that $r \neq f (i)$. Thus, the mapping $f$ cannot exist. $□$
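The diagonal construction can be sketched on the finite prefix of the table shown above. This is only an illustration: the rule "add one modulo 10" is one arbitrary way to pick a different digit (so the result differs from the $0.27619\dots$ chosen in the text), and the name `diagonal_digits` is our own.

```python
def diagonal_digits(rows):
    """Return digits r_0 r_1 ... where r_i differs from the i-th digit of rows[i]."""
    out = []
    for i, digits in enumerate(rows):
        d = int(digits[i])             # the digit on the diagonal
        out.append(str((d + 1) % 10))  # any digit other than d would do
    return "".join(out)


# Digits after the decimal point for the five rows of the table.
table = ["12345", "14159", "23333", "10000", "67321"]
r = diagonal_digits(table)
```

Here `r` is `"25412"`, which disagrees with row $i$ in position $i$ for every row, so it cannot equal any of them.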

Time Hierarchies

With diagonalization fleshed out more, we can now discuss time hierarchies.

Deterministic Time Hierarchy

First, we’ll show a time hierarchy theorem for deterministic computations.

Theorem. Let $f$ and $g$ be time constructible functions such that $f = o (g / \log (g))$. Then $DTIME (f) ⊊ DTIME (g)$.

As a corollary of the above theorem, we have:

Corollary. For any time constructible function $f$, there exists a language $L$ decidable in time $O (f)$ but not decidable in time $o (f / \log (f))$.

We now prove the time hierarchy theorem.

Proof. As you might expect from the preceding discussion, we’ll have a diagonalization proof. First, we build a deterministic Turing machine $D$ as follows.

For any $n \in N$ and any $x \in \{0, 1\}^{n}$, $D (x)$ does the following.

1. Compute $k = ⌈ g (n) / \log (g (n))⌉$.
2. Simulate $M_{x} (x)$ for at most $k$ steps, where $M_{x}$ is the machine described by the string $x$.
3. If $M_{x} (x)$ halts within $k$ steps, output $1 - M_{x} (x)$.
4. Else output $0$.

Note that step (1) can be done in $O (g (n))$ time since $g$ is time constructible. If $M_{x}$ runs in time $T$, then step (2) runs in time $O (T (n) \log (T (n)))$ since we can do universal simulation with only logarithmic overhead.
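The behavior of $D$ can be illustrated with a toy model in Python. Everything here is an assumption made for illustration: a "machine" is a Python function that takes an input and a step budget and returns its output bit, or `None` if it has not halted within the budget; the list `MACHINES` stands in for the enumeration of Turing machines; the input `x` is an index into that list rather than a string describing a machine; and the budget `k` is passed in directly instead of being computed from $g$.

```python
def parity(x, budget):
    """Toy machine: outputs x mod 2 after a pretend cost of bit_length(x) + 1 steps."""
    steps_needed = x.bit_length() + 1
    return x % 2 if budget >= steps_needed else None


def always_one(x, budget):
    """Toy machine: outputs 1 after a single step."""
    return 1 if budget >= 1 else None


def slow_zero(x, budget):
    """Toy machine: outputs 0, but only after a million steps."""
    return 0 if budget >= 10**6 else None


# Hypothetical enumeration M_0, M_1, M_2 of machines.
MACHINES = [parity, always_one, slow_zero]


def D(x, k):
    """Simulate M_x on x for at most k steps; flip its output, or output 0 on timeout."""
    out = MACHINES[x](x, k)
    if out is None:
        return 0
    return 1 - out


outputs = [D(x, 100) for x in range(len(MACHINES))]
```

With a budget of 100 steps, `D` disagrees with `parity` and `always_one` on their own indices, but it cannot diagonalize against `slow_zero`, which does not halt within the budget. This mirrors why $D$ only separates itself from machines that are fast enough.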

Now, let $A = L (D)$ denote the language of $D$; i.e., $A = \{x ∣ D (x) = 1\}$. By definition, $D$ decides $A$. Moreover, $A \in DTIME (g)$: $D (x)$ simulates $M_{x}$ for at most $k = ⌈ g (∣ x ∣) / \log (g (∣ x ∣))⌉$ steps, and simulating $k$ steps costs $O (k \log k) = O (g (∣ x ∣))$ time by universal simulation.

Claim. $A \notin DTIME (f)$.

This is where our diagonalization comes into play. Suppose this claim is not true. This implies $A \in DTIME (f)$ , and there is a decider $M_{A}$ deciding $A$ in time $O (f)$ .

Now, consider running $D (⟨ M_{A} ⟩)$, and let $n = ∣ ⟨ M_{A} ⟩ ∣$. Then $D$ simulates $M_{A} (⟨ M_{A} ⟩)$ for at most $k = ⌈ g (n) / \log (g (n))⌉$ steps. Notice that $M_{A}$ runs in time $O (f)$ on any input; in particular, $M_{A} (⟨ M_{A} ⟩)$ halts within $O (f (n))$ steps. Since $f = o (g / \log (g))$, we have $O (f (n)) = o (g (n) / \log (g (n)))$, so for large enough $n$,¹ $M_{A} (⟨ M_{A} ⟩)$ halts in fewer than $k$ steps. Moreover, by universal simulation, $D$ carries out this simulation in time $O (f (n) \log (f (n)))$, so $D$ still runs in at most $O (g (n))$ steps on this input.

This implies that $D (⟨ M_{A} ⟩)$ completes the simulation of $M_{A} (⟨ M_{A} ⟩)$ and outputs $1 - M_{A} (⟨ M_{A} ⟩)$. However, this implies that $D (⟨ M_{A} ⟩) \neq M_{A} (⟨ M_{A} ⟩)$. We assumed that $M_{A}$ decides the language $A$, which $D$ also decides by definition, but these two machines differ on this input. This is a contradiction, so $M_{A}$ does not exist. $□$

We have two important corollaries from the time hierarchy theorem.

Corollary. For all $1 \leq ε_{1} < ε_{2}$ , we have $DTIME (n^{ε_{1}}) ⊊ DTIME (n^{ε_{2}})$ .

Corollary. $P ⊊ EXP$ .
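To see why the first corollary follows, we can check the hypothesis of the theorem directly. The following is a short worked calculation, assuming $f (n) = n^{ε_{1}}$ and $g (n) = n^{ε_{2}}$ are time constructible:

$$\frac{f (n)}{g (n) / \log (g (n))} = \frac{n^{ε_{1}} \cdot ε_{2} \log n}{n^{ε_{2}}} = \frac{ε_{2} \log n}{n^{ε_{2} - ε_{1}}} \longrightarrow 0 \quad \text{as } n \to \infty,$$

since $ε_{2} - ε_{1} > 0$ and any positive power of $n$ grows faster than $\log n$. Hence $f = o (g / \log (g))$, and the theorem gives $DTIME (n^{ε_{1}}) ⊊ DTIME (n^{ε_{2}})$.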

Non-deterministic Time Hierarchy

Now, we move on to show the non-deterministic time hierarchy theorem.

Theorem. Let $f, g$ be time constructible functions such that $f (n + 1) = o (g (n))$ . Then, $NTIME (f) ⊊ NTIME (g)$ .

Proof. Unfortunately, we cannot do a standard diagonalization here. With the deterministic time hierarchy, we simulated the machine (which shouldn’t have existed), and were able to flip its output. The simulation was deterministic and always output the opposite of the machine $M_{A}$. However, a non-deterministic machine can have exponentially many computation paths on a single input, and they need not all produce the same output. Recall that a non-deterministic decider accepts a string if at least one computation path accepts, and rejects only if every computation path rejects. So we cannot simply flip the output of a single simulated path.

The idea behind the non-deterministic simulation will be to do a lazy simulation. In particular, our diagonalization will only differ on a single input; for all other inputs, we will output the correct bit (of the machine we are simulating). This will be enough to derive our contradiction.

We proceed with the proof. Let $i \in N$ and let $M_{i}$ denote the machine described by $⟨ i ⟩_{2}$ . Now let $f$ be a function such that $f (0) = 1$ , $f (1) = 2$ , and $f (i + 1) = 2^{f (i)^{2}}$ . Let $g$ be any function such that $f (n + 1) = o (g (n))$ .
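To get a feel for how quickly the interval endpoints $f (0), f (1), f (2), \dots$ grow, here is a small Python sketch. The helper names `interval_endpoints` and `find_interval` are our own; the second one looks up the index $i$ with $f (i) < n \leq f (i + 1)$ by a simple scan over the values we have computed.

```python
def interval_endpoints(count):
    """Return [f(0), ..., f(count - 1)] with f(0) = 1 and f(i + 1) = 2 ** (f(i) ** 2)."""
    values = [1]
    while len(values) < count:
        values.append(2 ** (values[-1] ** 2))
    return values


def find_interval(n, endpoints):
    """Return i with f(i) < n <= f(i + 1), or None if n is outside the computed range."""
    for i in range(len(endpoints) - 1):
        if endpoints[i] < n <= endpoints[i + 1]:
            return i
    return None


# Four values are plenty: f(3) = 2 ** 256 already has 78 decimal digits,
# and f(4) is far too large to write down.
endpoints = interval_endpoints(4)
```

The values are $1, 2, 16, 2^{256}$, so for example every $n$ with $2 < n \leq 16$ falls in the interval with index $i = 1$.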

Build a non-deterministic Turing machine $D$ which does the following. $D$ takes as inputs strings of the form $1^{n}$ for any $n \in N$ , where $1^{n}$ denotes the string of $n$ 1’s.

$D (1^{n}) :$

1. Compute $i$ such that $f (i) < n \leq f (i + 1)$.
2. If $f (i) < n < f (i + 1)$:
   1. Non-deterministically simulate $M_{i} (1^{n + 1})$ for at most $g (n)$ steps.
   2. If $M_{i}$ halts within $g (n)$ steps, output $M_{i} (1^{n + 1})$.
   3. Else output $1$.
3. If $n = f (i + 1)$:
   1. Deterministically simulate $M_{i} (1^{f (i) + 1})$ by trying all computation paths.
   2. Output $1 - M_{i} (1^{f (i) + 1})$.

Now, we argue that $D (1^{n})$ runs in time $O (g (n))$. First, step (1) takes at most $O (f (n)) = o (g (n))$ time. Second, all of step (2) takes only $O (g (n))$ time. Third, step (3.1) takes at most $O (2^{f (i) + 1}) = O (f (i + 1))$ time, which is $O (n)$ since $n = f (i + 1)$ in this case, and hence $O (g (n))$. Therefore, $D (1^{n})$ runs in time $O (g (n))$.

Let $A = L (D)$ . By the above discussion, we know that $A \in NTIME (g)$ .

Claim. $A \notin NTIME (f)$.

Again, suppose this is not the case. Then there is an NTM $M_{A}$ which decides $A$ in at most $O (f)$ time. Since every Turing machine has infinitely many descriptions, we can choose $i$ large enough such that $M_{i} = M_{A}$, and consider inputs $1^{n}$ with $f (i) < n \leq f (i + 1)$.

Now, run $D (1^{n})$ . If $n < f (i + 1)$ , then $D$ simulates $M_{i} (1^{n + 1})$ for at most $g (n)$ steps. By construction, $M_{i} = M_{A}$ , and $M_{A} (1^{n + 1})$ runs in non-deterministic time $f (n + 1) = o (g (n))$ . So the simulation halts before $g (n)$ steps and $D (1^{n}) = M_{A} (1^{n + 1})$ .

This implies the following equalities.

$$\begin{aligned} D (1^{f (i) + 1}) &= M_{A} (1^{f (i) + 2}) \\ D (1^{f (i) + 2}) &= M_{A} (1^{f (i) + 3}) \\ D (1^{f (i) + 3}) &= M_{A} (1^{f (i) + 4}) \\ &\;\;\vdots \\ D (1^{f (i + 1) - 1}) &= M_{A} (1^{f (i + 1)}) \end{aligned}$$

Moreover, by assumption, $M_{A}$ and $D$ both decide the same language $A$. This implies for all $j \in \{f (i) + 1, f (i) + 2, \dots, f (i + 1)\}$, we have $D (1^{j}) = M_{A} (1^{j})$. This actually shows that $D (1^{j}) = D (1^{j + 1})$ for all $j \in \{f (i) + 1, f (i) + 2, \dots, f (i + 1) - 1\}$; similarly, the same is true for $M_{A}$: $M_{A} (1^{j}) = M_{A} (1^{j + 1})$ for all such $j$.

Now suppose that $n = f (i + 1)$. By construction, $D (1^{n})$ now simulates $M_{A} (1^{f (i) + 1})$ deterministically and outputs $1 - M_{A} (1^{f (i) + 1})$. Here, $M_{A}$ outputs $1$ if there exists an accepting path, and outputs $0$ otherwise. This implies that $D (1^{f (i + 1)}) \neq M_{A} (1^{f (i) + 1})$. But this is a contradiction: above we established that $D (1^{j}) = M_{A} (1^{j})$ for all $j \in \{f (i) + 1, f (i) + 2, \dots, f (i + 1)\}$ and that $M_{A} (1^{j}) = M_{A} (1^{j + 1})$ along this range, so $D (1^{f (i + 1)}) = M_{A} (1^{f (i + 1)}) = M_{A} (1^{f (i) + 1})$. Thus, $M_{A}$ cannot exist. $□$
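The combinatorial core of this contradiction can be checked by brute force on a small interval. The sketch below is a toy model under stated assumptions: `lo` and `hi` are small stand-ins for $f (i)$ and $f (i + 1)$, a candidate $M_{A}$ is just a lookup table of accept/reject bits indexed by input length, and running times are ignored entirely. The names `lazy_D` and `some_M_agrees_with_D` are our own.

```python
from itertools import product


def lazy_D(M, lo, hi):
    """Build D on lengths lo+1..hi from a candidate M on the same lengths.

    D(n) = M(n + 1) for lo < n < hi, and D(hi) = 1 - M(lo + 1).
    """
    D = {n: M[n + 1] for n in range(lo + 1, hi)}
    D[hi] = 1 - M[lo + 1]
    return D


def some_M_agrees_with_D(lo, hi):
    """Try every table M on lengths lo+1..hi; report whether any equals its own D."""
    lengths = range(lo + 1, hi + 1)
    for bits in product([0, 1], repeat=len(lengths)):
        M = dict(zip(lengths, bits))
        if lazy_D(M, lo, hi) == M:
            return True
    return False


# Stand-in interval (2, 8]: 64 candidate tables, none of which agrees with D.
result = some_M_agrees_with_D(2, 8)
```

Agreement on every length below `hi` forces `M` to be constant on the interval, and then the flipped bit at length `hi` breaks the agreement, which is exactly the chain of equalities used in the proof.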

¹ Recall that every Turing machine has an infinite number of equivalent strings which describe said machine.
