Lecture 8

In-class notes: CS 505 Spring 2025 Lecture 8

NP-Intermediate Languages

So far, we’ve looked at many NP problems that also happen to be NP-complete. Thus, it is natural to ask whether all languages in NP are NP-complete. It turns out that the answer is no, under the widely believed conjecture that $P \neq NP$: in that case there are languages in NP that are neither in $P$ nor NP-complete.

Theorem. If $P \neq NP$ , then there exists $L \in NP ∖ P$ such that $L$ is not NP-complete.

In other words: if all languages $L \in NP ∖ P$ are NP-complete, then $P = NP$ .

Example

Two examples of languages we believe are not NP-complete, and which are not known to be in $P$ , are factoring and graph isomorphism. Factoring asks for the prime factors $p_{1}, p_{2}, \dots, p_{k}$ of an integer $N$ ; as a decision problem, it is usually phrased as asking whether $N$ has a nontrivial factor below a given bound. Graph isomorphism asks if two graphs $G_{1}, G_{2}$ are isomorphic. This means that there exists a bijection $π : V (G_{1}) \to V (G_{2})$ such that $(u, v) \in E (G_{1})$ if and only if $(π (u), π (v)) \in E (G_{2})$ .
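
Graph isomorphism is in NP because a claimed bijection $π$ can be checked in polynomial time. The following is a minimal sketch of such a check, not part of the lecture: the function name `is_isomorphism` and the edge-list representation are assumptions made for illustration, and the sketch assumes undirected graphs without self-loops or isolated vertices, with `pi` defined on every vertex of $G_{1}$.

```python
def is_isomorphism(edges1, edges2, pi):
    """Check that pi maps the edge set of G1 exactly onto the edge set of G2.

    edges1, edges2: iterables of undirected edges (u, v) with u != v.
    pi: dict sending each vertex of G1 to a vertex of G2 (hypothetical format).
    """
    # A relabeling of vertices must be injective.
    if len(set(pi.values())) != len(pi):
        return False
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    # Push every edge of G1 through pi and compare with the edges of G2.
    mapped = {frozenset(pi[u] for u in e) for e in e1}
    return mapped == e2


# A path 1 - 2 - 3 and a path b - a - c.
g1 = [(1, 2), (2, 3)]
g2 = [("a", "b"), ("a", "c")]
print(is_isomorphism(g1, g2, {1: "b", 2: "a", 3: "c"}))
print(is_isomorphism(g1, g2, {1: "a", 2: "b", 3: "c"}))
```

The check itself is cheap; the difficulty of the problem lies in finding $π$ among the $n!$ candidates.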

Oracle Machines

When we showed $DTIME (f) ⊊ DTIME (g)$ , we utilized diagonalization. In the proof, we had a decider $D$ for some language $L \in DTIME (g)$ , and, for the sake of contradiction, we assumed we had a machine $M_{A}$ which decided $L$ in time $O (f)$ . The machine $D$ received the description of $M_{A}$ as input and simulated it.

At a high level, diagonalization is possible because of two key properties.

Turing machines always have efficient representations as strings.
The universal simulation of any Turing machine given its efficient representation as a bit string does not examine the inner workings of the machine.

In the machines $D$ and $M_{A}$ above, the machine $D$ simulates $M_{A}$ obliviously: it does not need to look at what $M_{A}$ is doing, and it does not care about the internal mechanisms of $M_{A}$ . Thus, $D$ treats $M_{A}$ as a black box: it gives $M_{A}$ some input and $M_{A}$ gives $D$ some output.

Oracle Turing Machines

We can abstract these two properties and define oracle Turing machines. These are special Turing machines with an additional oracle tape, and access to some oracle $O$ . We denote this as $M^{O}$ . The machine $M^{O}$ can query $O (x)$ for any input $x$ in a single computation step, and $O (x)$ writes the output to the special oracle tape in this single computation step.

For a language $L$ , we let $M^{L}$ denote an oracle Turing machine with an oracle to a decider for the language $L$ . This allows us to define oracle complexity classes.

$P^{L}$ is the set of all languages decidable in deterministic polynomial time relative to the oracle/language $L$ .
$NP^{L}$ is the set of all languages decidable in non-deterministic polynomial time relative to the oracle/language $L$ .

As a concrete example, $P^{SAT}$ is the set of all languages decidable by a deterministic polynomial-time oracle Turing machine with oracle access to a decider for SAT. That is, the oracle answers $SAT (ϕ) = 1$ if and only if $ϕ$ is a satisfiable formula. Recall the complement language of SAT: $\overline{SAT}$ is the set of all unsatisfiable formulas $ϕ$ .

Lemma. $\overline{SAT} \in P^{SAT}$ .

Proof. We build a deterministic polynomial-time Turing machine $M$ that is given oracle access to $SAT$ such that $M^{SAT} (ϕ) = 1$ if and only if $ϕ$ is not satisfiable. The machine $M^{SAT}$ is simple. On input $ϕ$ , $M$ queries $SAT (ϕ)$ and outputs the opposite answer. Since $SAT (ϕ) = 1$ if and only if $ϕ$ is satisfiable, $M^{SAT}$ decides $\overline{SAT}$ . Moreover, $M$ makes a single oracle query and flips one bit, so it runs in polynomial time. $□$
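
The machine in this proof can be sketched in a few lines. This is only an illustration under stated assumptions: formulas are taken to be in CNF with DIMACS-style integer literals, and the oracle is passed in as a function. The stand-in oracle `brute_force_sat` is hypothetical and takes exponential time; in the oracle model its answer would cost a single step.

```python
from itertools import product


def brute_force_sat(clauses, num_vars):
    """Stand-in SAT oracle (exhaustive search, exponential time).

    clauses: list of clauses, each a list of nonzero ints; literal k means
    variable k is true, and -k means variable k is false.
    """
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return 1
    return 0


def unsat_decider(clauses, num_vars, sat_oracle):
    """The machine M^SAT: one oracle query, then output the opposite answer."""
    return 1 - sat_oracle(clauses, num_vars)


# (x1) AND (NOT x1) is unsatisfiable, so it belongs to the complement of SAT.
print(unsat_decider([[1], [-1]], 1, brute_force_sat))
# (x1 OR x2) AND (NOT x1) is satisfiable, so it does not.
print(unsat_decider([[1, 2], [-1]], 2, brute_force_sat))
```

All of the work is hidden inside the oracle; `unsat_decider` itself does a constant amount of work beyond writing the query.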

Actually, we can show a stronger result.

Lemma. $\overline{SAT} \in P^{L}$ for any NP-complete language $L$ .

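Proof sketch. At a high level, given a formula $ϕ$ as input, we apply a polynomial-time reduction from $SAT$ to the language $L$ , which exists because $L$ is NP-complete. This maps $ϕ$ to an instance $y$ such that $y \in L$ if and only if $ϕ$ is satisfiable. We then query the oracle on $y$ and output the opposite answer. For example, we can reduce $ϕ$ to a $3 SAT$ instance, then reduce the $3 SAT$ instance to $L$ (or simply do a direct reduction).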

As another result, an oracle for a language in $P$ gives a polynomial-time machine no additional power.

Theorem. For any $L \in P$ , we have $P^{L} = P$ .

Proof. $P \subseteq P^{L}$ is immediate since every language in $P$ is decidable in polynomial time without an oracle. For $P^{L} \subseteq P$ , since $L \in P$ , we can convert any polynomial-time Turing machine with oracle access to $L$ into another Turing machine which decides the same language but, instead of querying the oracle, runs a polynomial-time decider for $L$ on each query. There are polynomially many queries, each of polynomial length, and each simulation is polynomial time, so the total time is polynomial. $□$
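
The conversion in this proof amounts to plugging a decider in where the oracle used to be. The sketch below uses a toy language and a toy oracle machine, both invented for illustration: `decide_L` decides the strings with at least as many `0`s as `1`s, and `oracle_machine` accepts $w$ if and only if every prefix of $w$ is in $L$ .

```python
def decide_L(s):
    """Hypothetical polynomial-time decider for a toy language L:
    binary strings with at least as many 0s as 1s."""
    return int(s.count("0") >= s.count("1"))


def oracle_machine(w, oracle):
    """Toy oracle machine: accepts w iff every prefix of w is in L.
    It makes len(w) + 1 queries, each of length at most len(w)."""
    return int(all(oracle(w[:k]) == 1 for k in range(len(w) + 1)))


def oracle_free_machine(w):
    """Same language, no oracle: each query is answered by running decide_L."""
    return oracle_machine(w, decide_L)


print(oracle_free_machine("0011"))
print(oracle_free_machine("0110"))
```

The running time is the number of queries times the cost of `decide_L` on a query, which is a polynomial of a polynomial and hence polynomial.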

Note there are powerful oracles relative to which $P$ and $NP$ are equal. One such oracle is for the language $E = EXPCOM$ , which we define as the set of all tuples $(M, x, 1^{n})$ where the machine $M$ outputs $M (x) = 1$ within $2^{n}$ steps.

Lemma. $P^{E} = NP^{E} = EXP$ .

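Proof. First, we show $EXP \subseteq P^{E}$ . Let $L \in EXP$ be decided by a machine $M$ in time $2^{m^{c}}$ on inputs of length $m$ . A polynomial-time machine with oracle access to $E$ writes the query $(M, x, 1^{∣ x ∣^{c}})$ , which has polynomial length, and outputs the oracle's answer. So an exponential-time computation is replaced by a single oracle query.

Second, $P^{E} \subseteq NP^{E}$ is trivially true since every deterministic oracle machine is also a non-deterministic oracle machine.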

Third, we show that $NP^{E} \subseteq EXP$ . Suppose that $L \in NP^{E}$ and, for simplicity, that $L$ is decided by an NTM $M^{E}$ in time $n$ . We construct a deterministic machine $D$ that decides $L$ in exponential time. $D$ is given the description of $M$ and deterministically simulates $M$ on any input $x$ by simulating all possible computation paths. $D$ outputs accept if and only if there is at least one accepting computation path, and outputs reject otherwise. Since $M$ runs in non-deterministic time $n$ , there are at most $2^{O (n)}$ computation paths, each of length at most $n$ . Finally, whenever $M$ calls the oracle $E$ on a query $(M^{'}, y, 1^{i})$ , which must have $i \leq n$ since $M$ has only $n$ steps in which to write it, the machine $D$ simply runs $M^{'} (y)$ for at most $2^{i}$ steps and returns the result. Each query therefore costs at most exponential time, so the total running time of $D$ is still $2^{O (n)}$ . $□$
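
The core of $D$ is the exhaustive search over computation paths. As a rough sketch, assume each path is determined by a string of guessed bits and that a function `branch_decider(x, guesses)` (a hypothetical interface, not from the lecture) reports whether that path accepts. Oracle calls are not modeled here; in the proof they are answered by direct simulation inside each path.

```python
from itertools import product


def simulate_all_branches(branch_decider, x, num_guesses):
    """Accept iff at least one sequence of guessed bits leads to acceptance.

    Tries all 2**num_guesses paths, so the cost is exponential in num_guesses.
    """
    for guesses in product("01", repeat=num_guesses):
        if branch_decider(x, "".join(guesses)):
            return 1
    return 0


def subset_sum_branch(x, guesses):
    """Toy nondeterministic machine: the guessed bits select a subset of the
    numbers, and the path accepts if the subset sums to the target."""
    numbers, target = x
    return sum(a for a, bit in zip(numbers, guesses) if bit == "1") == target


print(simulate_all_branches(subset_sum_branch, ([3, 5, 7], 12), 3))
print(simulate_all_branches(subset_sum_branch, ([3, 5, 7], 4), 3))
```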

Limits of Diagonalization

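Oracle machines help us quantify the limits of diagonalization. Diagonalization is a powerful and general technique, so naturally we’d like to resolve $P$ vs. $NP$ using it. Unfortunately, diagonalization alone cannot do this. A diagonalization argument treats machines as black boxes, so it works equally well when every machine is given access to the same oracle. The following theorem shows that the answer to $P$ vs. $NP$ changes depending on the oracle.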

Theorem. There exist oracles $O_{1}$ and $O_{2}$ such that

$P^{O_{1}} = NP^{O_{1}}$ ; and
$P^{O_{2}} \neq NP^{O_{2}}$ .

Proof. Setting $O_{1} = E$ (the $EXPCOM$ oracle), we have (1) of the theorem by our previous lemma.

Now we construct an oracle $O_{2}$ to prove (2). Interestingly enough, we will use diagonalization to construct this oracle. First, let $L \subseteq \{0, 1\}^{*}$ be any language. Define a new language $U_{L}$ as $U_{L} = \{1^{n} ∣ \exists x \in L \text{ s.t. } ∣ x ∣ = n\}$ .

Notice that $U_{L} \in NP^{L}$ for any language $L$ . To see this, define the machine $M^{L} (1^{n})$ which (1) guesses $x \in \{0, 1\}^{n}$ ; and (2) queries the oracle on $x$ and outputs the answer $L (x)$ . By definition, $L (x) = 1$ if and only if $x \in L$ ; otherwise the oracle answers $0$ . Some computation path accepts if and only if $L$ contains a string of length $n$ , so this non-deterministic machine decides $U_{L}$ .
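
A single computation path of this machine is just one oracle query on the guessed string. The sketch below makes that explicit; the names `verify_branch` and `in_U_L` and the toy language are assumptions for illustration, and `in_U_L` replaces the nondeterministic guess with an exponential brute-force search over all guesses.

```python
from itertools import product


def verify_branch(n, x, oracle):
    """One computation path of M^L(1^n): x is the guessed string."""
    return int(len(x) == n and oracle(x) == 1)


def in_U_L(n, oracle):
    """Deterministic stand-in for the NTM: accept iff some guess is accepted.
    This tries all 2**n guesses, which is exactly the exponential search
    that a polynomial-time deterministic machine cannot afford."""
    return int(any(verify_branch(n, "".join(bits), oracle)
                   for bits in product("01", repeat=n)))


# Toy language L, given as an explicit finite set.
toy_L = {"101", "0000"}
toy_oracle = lambda x: int(x in toy_L)

print([in_U_L(n, toy_oracle) for n in range(1, 5)])
```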

Now, we construct a new language $B$ such that $U_{B} \notin P^{B}$ . Then, we will set $O_{2} = B$ , completing the proof. We’ll define the language $B$ inductively, first by setting $B = \emptyset$ . We’ll also have two helper sets (i.e., helper variables) $Q = \emptyset$ and $B^{'} = \emptyset$ .

Step 1.
1. Let $M_{1}$ be the deterministic oracle Turing machine defined by the bit string $⟨ 1 ⟩_{2}$ . For simplicity, we assume that $M_{1}$ runs in time $O (∣ x ∣)$ on input $x$ .
2. Choose $n$ such that $2^{n} > n$ .
3. Run $b = M_{1}^{B} (1^{n})$ .
  1. Whenever $M_{1}$ queries oracle $B$ at string $q$ , reply with $0$ /reject. Update $Q = Q \cup \{q\}$ .
4. If $b = 1$ :
  1. Update $B^{'} = B^{'} \cup \{0, 1\}^{n}$ . That is, add all $n$ -bit strings to the set $B^{'}$ . Here, $B^{'}$ represents the set of all strings that are not in $B$ .
5. If $b = 0$ :
  1. Find $x^{*} \in \{0, 1\}^{n}$ such that $x^{*} \notin Q$ . Such a string exists because $M_{1}$ made fewer than $2^{n}$ queries.
  2. Update $B = B \cup \{x^{*}\}$ .
  3. Update $B^{'} = B^{'} \cup (\{0, 1\}^{n} ∖ \{x^{*}\})$ .
For $i = 2, 3, 4, \dots$ :
1. Let $M_{i}$ be the deterministic oracle Turing machine defined by the bit string $⟨ i ⟩_{2}$ . Assume machine $M_{i}$ runs in time $n^{i}$ for inputs of length $n$ .
2. Choose $n$ such that $2^{n} > n^{i}$ and such that $n$ is larger than every length chosen in an earlier step and the length of every string in $Q$ . This ensures that answers given in earlier steps are never invalidated later.
3. Run $b = M_{i}^{B} (1^{n})$ .
  1. Whenever $M_{i}$ queries oracle $B$ at string $q$ , update $Q = Q \cup \{q\}$ .
  2. If $q \in B$ , reply $1$ /accept.
  3. Otherwise (in particular, if $q \in B^{'}$ ), reply $0$ /reject.
4. If $b = 1$ :
  1. Update $B^{'} = B^{'} \cup \{0, 1\}^{n}$ .
5. If $b = 0$ :
  1. Find $x^{*} \in \{0, 1\}^{n}$ such that $x^{*} \notin Q$ .
  2. Update $B = B \cup \{x^{*}\}$ .
  3. Update $B^{'} = B^{'} \cup (\{0, 1\}^{n} ∖ \{x^{*}\})$ .
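
The construction can be run on a finite list of toy machines to see it defeat each one. The following is a sketch under several assumptions that are not in the notes: machines are modeled as Python functions `machine(n, oracle)` standing for $M_{i}^{B} (1^{n})$ , the set $B^{'}$ is left implicit (any string not in $B$ is answered $0$ ), the length $n$ is always chosen beyond every string queried so far, and each machine is assumed to query fewer than $2^{n}$ strings (otherwise the search for $x^{*}$ fails).

```python
from itertools import product


def build_B(machines):
    """Run one step of the construction per machine; return (B, lengths)."""
    B = set()        # strings placed in the oracle language
    Q = set()        # every string queried so far
    lengths = []     # the length n chosen at each step
    n = 0
    for i, machine in enumerate(machines, start=1):
        # Choose n beyond all earlier lengths and queries, with 2^n > n^i.
        n = max(n, max((len(q) for q in Q), default=0)) + 1
        while 2 ** n <= n ** i:
            n += 1

        def oracle(q):
            Q.add(q)
            return int(q in B)   # strings not (yet) in B are answered 0

        b = machine(n, oracle)
        if b == 0:
            # The machine said "no string of length n": add an unqueried one.
            candidates = ("".join(bits) for bits in product("01", repeat=n))
            B.add(next(x for x in candidates if x not in Q))
        # If b == 1, no string of length n is ever added to B.
        lengths.append(n)
    return B, lengths


# Hypothetical toy machines, each making at most n queries.
def always_yes(n, oracle): return 1
def always_no(n, oracle): return 0
def probe_zeros(n, oracle): return oracle("0" * n)
def probe_some(n, oracle):
    return int(any(oracle(format(k, f"0{n}b")) for k in range(n)))


machines = [always_yes, always_no, probe_zeros, probe_some]
B, lengths = build_B(machines)
for machine, n in zip(machines, lengths):
    answer = machine(n, lambda q: int(q in B))
    truth = int(any(len(x) == n for x in B))
    print(machine.__name__, n, answer, truth)
```

For each machine, the printed answer on $1^{n}$ differs from the true membership of $1^{n}$ in $U_{B}$ . The real construction does the same for every polynomial-time oracle machine, using the enumeration $M_{1}, M_{2}, \dots$ in place of this finite list.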

Now, we claim that for this language $B$ , we have $U_{B} \notin P^{B}$ . First, clearly $U_{B} \in NP^{B}$ since an NTM can simply guess a string of length $n$ and check that it is in the language $B$ . Now, let $M$ be any deterministic polynomial-time oracle Turing machine and suppose by way of contradiction that $M^{B}$ decides $U_{B}$ . Every Turing machine has an infinite number of equivalent descriptions, so $M$ appears infinitely often in the list $M_{1}, M_{2}, \dots$ . In particular, there exists some $i^{*}$ such that $M = M_{i^{*}}$ and $M$ runs in time at most $n^{i^{*}}$ . Let $n$ be the length chosen in step $i^{*}$ and consider $M^{B} (1^{n})$ . If $M^{B} (1^{n}) = 1$ , it is saying that there exists $x \in B$ such that $∣ x ∣ = n$ . However, in this case, by construction of $B$ (in particular, the case $b = 1$ ), we know that all strings of length $n$ are in the set $B^{'}$ and are not in the set $B$ . So $M^{B} (1^{n})$ should output $0$ in this case. Similarly, if $M^{B} (1^{n}) = 0$ , then by construction of $B$ , we know there exists some length $n$ string $x^{*} \in B$ , so $M^{B} (1^{n})$ should output $1$ in this case. Both cases lead to a contradiction, so $U_{B} \notin P^{B}$ . $□$

Intuitively, in the above diagonalization proof, we are exploiting two key facts: (1) there are an infinite number of equivalent Turing machine descriptions; and (2) deterministic Turing machines cannot search an exponential space in polynomial time. (1) allows us to say that if we are given some decider, then there is some $i^{*}$ for which $M_{i^{*}}$ is the same machine, which means we have considered it in our construction of $B$ . (2) allows us to diagonalize. In particular, $M_{i}^{B} (1^{n})$ must produce an output by only making a polynomial number of queries (at most $n^{i}$ ). Since $n^{i} < 2^{n}$ for large enough $n$ , we know that $M_{i}^{B}$ could not have possibly queried the entire set of length $n$ bit strings. So, intuitively, the deterministic machine is making a decision with incomplete information.

We exploit this. If $M_{i}^{B} (1^{n}) = 1$ , we declare all length $n$ strings to not be in the language. Since $M_{i}$ could not have queried all length $n$ strings before outputting its decision, the lack of information leads it to make a wrong decision. Similarly, if $M_{i}^{B} (1^{n}) = 0$ , then again there is no way $M_{i}$ could have queried all length $n$ bit strings. So we explicitly find one that was not queried and add it to the set $B$ . This again causes $M_{i}$ to output erroneously. Thus, we cannot decide the language $U_{B}$ in deterministic polynomial time when given an oracle to $B$ .
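
The counting fact behind the choice of $n$ is easy to check numerically. The helper names below are made up for this sketch: `first_good_length` finds the first length at or after `start` where $2^{n} > n^{i}$ , and `unqueried_lower_bound` counts how many length $n$ strings a machine limited to $n^{i}$ queries must leave untouched.

```python
def first_good_length(i, start=1):
    """Smallest n >= start with 2**n > n**i."""
    n = start
    while 2 ** n <= n ** i:
        n += 1
    return n


def unqueried_lower_bound(n, i):
    """Strings of length n that a machine making at most n**i queries
    cannot have queried."""
    return 2 ** n - n ** i


for i in range(1, 5):
    n = first_good_length(i, start=2)
    print(i, n, unqueried_lower_bound(n, i))
```

The gap $2^{n} - n^{i}$ only grows as $n$ increases, so the adversary always has room to hide or withhold a string.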

CS 505 - Computability and Complexity Theory (Spring 2025)
