Lecture 3
In-class notes: CS 505 Spring 2025 Lecture 3
Undecidability Wrap-up
We begin by wrapping up our discussion of undecidability.
The Halting Problem
From last time, we’ll finish proving that the halting problem is undecidable. First, recall the definition of the halting problem: $\mathrm{HALT} = \{ (\langle M \rangle, x) : M \text{ halts on input } x \}$.
Theorem 2.7. $\mathrm{HALT}$ is undecidable.
Proof. We’ll prove this via a reduction from the language $D$ from last lecture, defined as $D = \{ \langle M \rangle : M(\langle M \rangle) \neq 1 \}$.
Our proof will be by contradiction. In particular, this means we’ll assume that $\mathrm{HALT}$ is decidable, then derive our contradiction by giving a decider for $D$.
Thus assume that $\mathrm{HALT}$ is decidable. This means there is a Turing machine $M_{\mathrm{HALT}}$ which decides $\mathrm{HALT}$. This tells us that for every pair $(\langle M \rangle, x)$, we have $M_{\mathrm{HALT}}(\langle M \rangle, x) = 1$ if and only if $M(x)$ halts, and $M_{\mathrm{HALT}}(\langle M \rangle, x) = 0$ if and only if $M(x)$ does not halt.
We’ll use $M_{\mathrm{HALT}}$ to build a Turing machine $M_D$ which decides $D$. For any input $\langle M \rangle$, define $M_D(\langle M \rangle)$ as follows.
- Set $b \leftarrow M_{\mathrm{HALT}}(\langle M \rangle, \langle M \rangle)$.
- If $b = 0$, then output 1.
- If $b = 1$, then set $c \leftarrow M(\langle M \rangle)$.
- If $c = 0$, output 1.
- If $c = 1$, output 0.
Since $M_{\mathrm{HALT}}$ is a decider, it halts on all possible inputs $(\langle M \rangle, \langle M \rangle)$. Now, if $b = 0$, we know that $M(\langle M \rangle)$ does not halt, which implies that $M(\langle M \rangle) \neq 1$ and hence $\langle M \rangle \in D$. So we output 1 in this case. Next, if $b = 1$, we know that $M(\langle M \rangle)$ does halt. We then test the output of $M(\langle M \rangle)$ by running it. If $c = 0$, then again we know $\langle M \rangle \in D$, so we output 1. Otherwise, $c = 1$, and thus $\langle M \rangle \notin D$, so we output 0.
Thus, $M_D$ halts on all possible inputs, and clearly decides $D$. This contradicts our previous result that $D$ is undecidable. Therefore, $\mathrm{HALT}$ is undecidable.
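The decider in this proof can be sketched in Python, assuming (purely hypothetically) an oracle `halts` for the halting problem, which of course cannot actually exist. Here the oracle is faked with a lookup table over three toy "machines" (Python functions), and the language from last lecture is taken to be the diagonal language of machines that do not output 1 on their own description — all names below are illustrative, not from lecture.

```python
# Toy sketch of the reduction. Real Turing machines are replaced by
# Python functions, and the HALT decider is stubbed with a lookup table.

def loop_forever(x):
    while True:  # never halts (we never actually call this)
        pass

def always_one(x):
    return 1

def always_zero(x):
    return 0

# Hypothetical decider M_HALT, hard-coded for our three toy machines.
HALT_TABLE = {loop_forever: False, always_one: True, always_zero: True}

def halts(M, x):
    """Stub for the (impossible) decider of HALT on input (M, x)."""
    return HALT_TABLE[M]

def decides_D(M):
    """Decider for D = { <M> : M(<M>) != 1 }, built from `halts`."""
    b = halts(M, M)            # step 1: does M halt on its own description?
    if not b:                  # step 2: M(<M>) never halts, so M(<M>) != 1
        return 1
    c = M(M)                   # step 3: now safe to run M to completion
    return 1 if c == 0 else 0  # steps 4-5
```

Since `halts` cannot be implemented in general, `decides_D` cannot either; the sketch only illustrates the direction of the reduction.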
Final Remarks on Undecidability
Rice’s Theorem
It would be great if the halting problem were decidable, as it would give us an algorithmic way to check whether programs halt on all possible inputs. One may then wonder which other properties of programs we can algorithmically decide. Syntactic questions like “does this program have at least 5 for-loops?” are easy to check, but semantic questions about a program’s behavior, such as “does this program ever output 1?” or “do these two programs compute the same function?”, are unfortunately also undecidable.
This is a result known as Rice’s Theorem, which informally states that it is impossible to determine whether a computer program satisfies any non-trivial semantic property $P$. I.e., the language $L_P = \{ \langle M \rangle : M \text{ satisfies } P \}$ is undecidable. Here, a non-trivial property is a property of the function a program computes which is not true or false for every program (i.e., there are some programs that satisfy $P$, and some which do not).
Mathematical Incompleteness
The idea of undecidability (and uncomputability) is closely related to (and inspired by) Gödel’s incompleteness theorems. In the early 1900s, there was a large push to establish a set of mathematical axioms from which you can prove or disprove any mathematical statement. However, Gödel proved this is impossible: no matter what set of axioms you choose, there will always be statements you can neither prove nor disprove. This result directly inspired the later work on undecidability and uncomputability.
Time-Efficient Computations
We’ll now turn our focus to a central topic in complexity theory: defining classes of efficient computations. This leads us to defining and discussing various complexity classes. Informally, a complexity class is simply a set of languages that are decidable (resp., functions that are computable) within some resource bound. Example resource bounds include running in linear time, running in logarithmic space, etc.
Deterministic Time
Building towards what we as computer scientists consider efficient, we turn to time bounds. We’ll define the notion of deterministic time.
Definition. Let $T : \mathbb{N} \to \mathbb{N}$ be a function. A language $L$ is in the class $\mathrm{DTIME}(T(n))$ if and only if $L$ is decidable by a (deterministic) Turing machine in time $O(T(n))$.
All Turing machines we’ve discussed and defined so far have been deterministic. These machines all have straight-line computations: they execute their transition function, which simply outputs the next state. Later, we’ll see non-deterministic Turing machines, where the transition function can output a set of possible states and the Turing machine non-deterministically decides which state to pick next.
The Complexity Class P
Given the definition above, we can now define the set of (what we consider to be) all efficient computations. This is the complexity class P (which stands for polynomial).
Definition (P). $\mathsf{P} = \bigcup_{c \geq 1} \mathrm{DTIME}(n^c)$.
We consider anything computed in polynomial time (with respect to the input length) to be efficient. Examples of problems/languages in P include:
- Graph connectivity
- Directed graph reachability: does a path exist from vertex $s$ to vertex $t$?
- Checking if a graph is a tree
- Integer multiplication: does $a \cdot b = c$?
- Are the integers $a$ and $b$ relatively prime?
- Gaussian elimination over the rational numbers: for a matrix $A \in \mathbb{Q}^{m \times n}$ and vector $b \in \mathbb{Q}^m$, does there exist $x \in \mathbb{Q}^n$ such that $Ax = b$?
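As an illustration of the first example above, graph connectivity can be decided in polynomial time with a short breadth-first search; this sketch uses an adjacency-list encoding of my own choosing, and BFS visits each vertex and edge a constant number of times, so the runtime is $O(|V| + |E|)$.

```python
# Polynomial-time check that an undirected graph is connected.
from collections import deque

def is_connected(n, edges):
    """n vertices labeled 0..n-1; edges given as pairs (u, v)."""
    if n == 0:
        return True
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}               # start a BFS from vertex 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n    # connected iff BFS reached every vertex
```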
Discussions on P
Does the computational model matter?
We’ve defined P with respect to $k$-tape Turing machines. But, as we’ve seen, $k$-tape Turing machines are equivalent to all other Turing machine models we’ve discussed, including RAM Turing machines, which reasonably emulate real-life computers. Moreover, the “equivalence” here is that each machine can simulate any of the others with at most polynomial overhead in the runtime. This means the class P is the same under all of these models.
In fact, many people believe that Turing machines can simulate any physically realizable computational model or system. This is known as the Church-Turing thesis.1 Some people also believe in the strong Church-Turing thesis, which states that this simulation can be done with only polynomial overhead in the runtime. However, as we get closer to quantum computing being physically realizable, people may stop believing in this since, for now, we do not know of a way to simulate quantum computations on standard Turing machines with only polynomial overhead.
Why polynomial time?
It is certainly true that an algorithm running in time $n^{100}$ is impractical even for small $n$; yet $n^{100}$ is a polynomial. Why, then, do we consider all polynomial-time algorithms to be “efficient?”
One reason is above: the Turing machine is polynomially equivalent to pretty much every model we have thought of, so it makes sense that polynomial time should appear somewhere in what we consider to be an efficient computation. Polynomials also compose well, which emulates how we compose computer programs. Often, computer programs will run sub-routines, and will run routines one after another. If all these runtimes are polynomial, then the final runtime remains polynomial as well. This is because for two polynomials $p$ and $q$, the functions $p(n) + q(n)$, $p(n) \cdot q(n)$, and $p(q(n))$ are all still polynomials.
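As a concrete instance of these closure properties (with illustrative polynomials chosen here, not taken from lecture), take $p(n) = n^2$ and $q(n) = n^3$:

```latex
% Closure of polynomials under sum, product, and composition,
% for the example p(n) = n^2 and q(n) = n^3.
\[
  p(n) + q(n) = n^2 + n^3, \qquad
  p(n) \cdot q(n) = n^5,   \qquad
  p(q(n)) = (n^3)^2 = n^6,
\]
```

all of which are again polynomials. So a program that makes polynomially many calls to polynomial-time subroutines still runs in polynomial time overall.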
Another reason is historical and heuristic. Often, someone first solves a problem in polynomial time, but with a large polynomial runtime like $n^{20}$; the algorithm is later improved to a more reasonable polynomial, such as $n^3$ or $n^2$.
Finally, polynomial-time problems are roughly equivalent to most (if not all) problems that we can efficiently solve on modern computers.
Worst-case time complexity is too restrictive
Suppose that for most inputs of a problem you have a fast (say, linear-time) algorithm, but for the remaining few inputs you only have an exponential-time algorithm; then we’d still say the problem is solved in exponential time. In particular, we keep P as a worst-case class. Some argue that this is too restrictive, which is valid. However, it is often much simpler to construct an algorithm that can solve all problem inputs in some amount of time, rather than trying to enumerate the (possibly infinitely many) inputs which admit better algorithms.
This criticism of P is also addressed within complexity theory itself via the introduction of alternative models and classes, including approximation algorithms and average-case complexity.
Decision problems are too limited
We’ve framed P as a class of decision problems, but often we actually want to find solutions to these problems. This is known as a search problem, where you are asked to find an answer rather than decide if something is true or false. An example of this is: instead of deciding if there exists an $x$ such that $Ax = b$, you actually compute the solution $x$. It can also be difficult to frame some search problems as decision problems in the first place.
However, most often it is the case that the difference between search and decision problems is, again, only polynomial. That is, we often can solve a search problem when given an algorithm that decides the equivalent decision problem, only costing us polynomial overhead in the runtime; the reverse is often true as well.
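As a sketch of such a search-to-decision reduction, consider the subset-sum problem (my choice of example, not from lecture): given a hypothetical decision procedure `exists_subset`, here stubbed out with brute force just so the example runs, the search procedure below recovers an actual subset using only $O(n)$ decision queries.

```python
# Search-to-decision reduction for subset-sum.
from itertools import combinations

def exists_subset(items, target):
    """Decision problem: does some subset of `items` sum to `target`?
    Brute-force stand-in for whatever decision algorithm we are given."""
    return any(sum(c) == target
               for r in range(len(items) + 1)
               for c in combinations(items, r))

def find_subset(items, target):
    """Search problem, solved with one decision query per item."""
    if not exists_subset(items, target):
        return None
    chosen = []
    remaining = list(items)
    while remaining:
        x = remaining.pop()
        # If the target is unreachable without x, then x must be taken.
        if not exists_subset(remaining, target):
            chosen.append(x)
            target -= x
    return chosen
```

Each discarded or kept item preserves the invariant that the remaining target is reachable, so the overhead over the decision procedure is only polynomially many calls.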
Time-Efficient Verification of Problems
Sometimes, we don’t want to solve problems ourselves, but would like to verify claimed solutions when given an answer. Moreover, this verification should be at least as efficient as solving the problem from scratch.
Suppose we are given a large integer $N$ and would like to find the prime factorization of $N$, which we denote as $N = p_1 \cdot p_2 \cdots p_k$. We believe it to be difficult to find $p_1, \dots, p_k$ given just $N$. However, if someone gives you numbers $q_1, \dots, q_k$ which are claimed to be the prime factors of $N$, there is a simple and efficient algorithm to verify this claim.
1. Check that each $q_i$ is prime.
2. Check that $q_1 \cdot q_2 \cdots q_k = N$.
Clearly (2) is efficient, requiring only $k - 1$ integer multiplications. A relatively recent result (the AKS primality test) showed that (1) is also doable in polynomial time. So verifying that $N$ is the product of the primes $q_1, \dots, q_k$ is efficient.
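A minimal sketch of this verification in Python, with trial division standing in for a true polynomial-time primality test (it is simple, but not efficient for large numbers; all names are illustrative):

```python
# Verify a claimed prime factorization of N:
#   (1) each claimed factor is prime, and (2) their product is N.

def is_prime(q):
    """Trial division; a stand-in for a polynomial-time test like AKS."""
    if q < 2:
        return False
    d = 2
    while d * d <= q:
        if q % d == 0:
            return False
        d += 1
    return True

def verify_factorization(N, factors):
    prod = 1
    for q in factors:
        if not is_prime(q):   # check (1)
            return False
        prod *= q
    return prod == N          # check (2)
```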
Efficiently Verifiable Languages
This gives us a new way to define languages: efficiently verifiable languages.
Definition. Let $L \subseteq \{0,1\}^*$ be a language. We say that $L$ is efficiently verifiable if there exist polynomials $p$ and $q$, and a Turing machine $V$ running in time $q(|x|)$, such that for all $x \in \{0,1\}^*$: $x \in L$ if and only if there exists $w \in \{0,1\}^{p(|x|)}$ such that $V(x, w) = 1$.
In the above definition, we call $V$ a verifier, $x$ the instance, and $w$ the certificate or witness.
The Class NP
This new notion of efficient verifiability gives us a new complexity class: NP.
Definition (NP). $\mathsf{NP}$ is the set of all efficiently verifiable languages.
P vs. NP
We widely believe that $\mathsf{P} \neq \mathsf{NP}$. In fact, we build many systems (e.g., cryptography) based on this assumption. Resolving the question either way is one of the [Millennium Prize Problems](https://www.claymath.org/millennium-problems/). However, we do know one thing for certain.
Theorem 3.1. $\mathsf{P} \subseteq \mathsf{NP}$.
This is true since every language $L \in \mathsf{P}$ can be decided in polynomial time with no witness/certificate at all: the verifier $V(x, w)$ can simply ignore $w$ and run the polynomial-time decider on $x$. So $L$ meets the definition of efficiently verifiable.
Non-deterministic Turing Machines and NP
There is an alternative definition of the class $\mathsf{NP}$ which utilizes non-deterministic Turing machines.
Definition. A non-deterministic $k$-tape Turing machine is identical to a (deterministic) $k$-tape Turing machine, except for the following modifications.
- The transition function is defined as $\delta : Q \times \Gamma^k \to \mathcal{P}(Q \times \Gamma^k \times \{\mathrm{L}, \mathrm{R}\}^k)$, where $\mathcal{P}$ denotes the power set operation.2 During any step of the computation, the transition function outputs a (possibly empty) set of possible next moves.
- Given this set of possible next moves from the transition function, the non-deterministic Turing machine non-deterministically chooses which one to execute.3
Intuitively, deterministic Turing machines (the ones we defined in Lecture 1) are “straight-line”: every step of the computation proceeds directly from the previous one. For non-deterministic Turing machines (which we’ll denote as NTMs), they look more like “branching” programs: at every step of the computation, the Turing machine has a set of possible computational paths to head down, and non-deterministically chooses the path to proceed down.
How do we define decidability of a language with respect to NTMs? At first, it may seem difficult, since there are many possible paths an NTM can go down during its computation. But the answer turns out to be simple: we require all computational paths to halt, and we accept exactly when at least one path (out of possibly exponentially many) accepts.
Definition. A language $L$ is decidable in time $T(n)$ by a non-deterministic Turing machine $N$ if
- $x \in L$ if and only if there exists at least one execution path of $N$ on input $x$ which outputs 1.
- All execution branches halt in time at most $T(|x|)$ for every input $x$.
We can use the above definition to extend DTIME to NTIME.
Definition. Let $T : \mathbb{N} \to \mathbb{N}$ be a function. Then we define $\mathrm{NTIME}(T(n))$ to be the set of all languages decidable by an NTM running in time $O(T(n))$.
Alternative Definition of NP
Given NTMs and NTIME, we can now see the original formulation of the class NP.
Theorem 3.2. $\mathsf{NP} = \bigcup_{c \geq 1} \mathrm{NTIME}(n^c)$.
Note that this definition is equivalent to the efficiently verifiable language definition. At a high level, this is because of the following two-way correspondence.
- Let $w$ be a witness to the fact that $x \in L$ (i.e., $V(x, w) = 1$ for efficient verifier $V$). Then, intuitively, $w$ corresponds to some accepting computational path on an NTM which decides $L$: the NTM non-deterministically guesses $w$ and then runs $V(x, w)$.
- Conversely, let $N$ be an NTM which decides $L$. Then for $x \in L$, we can take as witness $w$ the sequence of non-deterministic choices along a computational path that takes $N$ to an accepting state. The deterministic verifier takes $w$ as input and simulates $N$ by following the computational path specified by $w$.
Recall our prime factorization problem from before. Let $N$ be a large integer, and suppose we wish to find the prime factors of $N$. There is an extremely simple NTM which finds these prime factors. It does the following.
- Non-deterministically choose prime numbers $q_1, \dots, q_k$.
- Check if $q_1 \cdot q_2 \cdots q_k = N$. If yes, output $q_1, \dots, q_k$; else output 0.
Solving NTIME in DTIME
Currently, until $\mathsf{P}$ vs. $\mathsf{NP}$ is resolved, the most efficient way that we know of to solve problems in NTIME using only DTIME computations requires exponential time. Let $\mathsf{EXP}$ denote the class $\mathsf{EXP} = \bigcup_{c \geq 1} \mathrm{DTIME}(2^{n^c})$.
Lemma 3.3. $\mathsf{NP} \subseteq \mathsf{EXP}$.
Proof. Enumerate all possible branches of the NTM deciding the language (equivalently, enumerate all certificates/witnesses in the verifier definition). Then, run through this list until finding an accepting branch of the computation. If the original machine ran in time $T(n)$, then there are at most $2^{O(T(n))}$ branches, each of which can be simulated deterministically in time $O(T(n))$, so this procedure runs in time $2^{O(T(n))}$. By assumption, $T$ is a polynomial, so we are done.
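The enumeration in this proof can be sketched in Python from the verifier point of view: try every candidate witness of the appropriate length and run the polynomial-time verifier on each. The concrete verifier below is for subset-sum with bitstring witnesses, an illustrative choice of my own.

```python
# Exponential-time determinization: enumerate all 2^n witnesses.
from itertools import product

def verifier(instance, witness):
    """Poly-time check: does the subset selected by `witness` hit the target?"""
    items, target = instance
    return sum(x for x, bit in zip(items, witness) if bit) == target

def decide_by_enumeration(instance):
    items, _ = instance
    n = len(items)  # witness length p(n) = n for this encoding
    # 2^n verifier calls, each polynomial time: total time 2^{O(n)}.
    return any(verifier(instance, w) for w in product([0, 1], repeat=n))
```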
1. Note this is just a belief and not a formal theorem or conjecture. ↩
2. Given a set $S$, the power set of $S$, denoted $\mathcal{P}(S)$, is the set of all subsets of $S$. Notably, $|\mathcal{P}(S)| = 2^{|S|}$. ↩
3. Recall that non-determinism is not the same as behaving randomly. The choice of a non-deterministic machine is arbitrary and possibly not computable. ↩