
Lecture 2

In-class notes: CS 505 Spring 2025 Lecture 2

Measuring Runtime of Turing Machines

With the definition of Turing machines established, we can turn towards quantifying the run-time of Turing machines. Informally, the run-time of a Turing machine $M$ computing some function $f$ is the maximum amount of time needed to compute $f$ on all inputs (of a fixed length), where our measure of time corresponds to how many executions of the transition function $\delta$ the machine $M$ needs to utilize to compute $f$. First, we need to actually define what it means for a Turing machine $M$ to compute a function $f$.

Definition (Turing Machine Computation). Let $f : \{0,1\}^* \to \{0,1\}^*$ be a function and let $M$ be a Turing machine. We say that $M$ computes $f$ if for all $x \in \{0,1\}^*$, when $M$ is initialized with input $x$ it halts with $f(x)$ on the output tape. We denote this as $M(x) = f(x)$.

Now we can define the run-time of a Turing machine $M$ computing a function $f$.

Definition (Turing Machine Run-time). Let $f : \{0,1\}^* \to \{0,1\}^*$ be a function and let $M$ be a Turing machine which computes $f$. Furthermore, let $T : \mathbb{N} \to \mathbb{N}$ be a function. We say that $M$ computes $f$ in time $T(n)$ if $M$ computes $f(x)$ in at most $T(|x|)$ steps for all $x \in \{0,1\}^*$ (i.e., $M(x)$ halts within $T(|x|)$ steps). Here, a step of the Turing machine is a single execution of its transition function $\delta$.

In essence, executing one step of the transition function of a Turing machine is the atomic Turing machine operation.
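To make step counting concrete, here is a minimal Python sketch of a single-tape simulator that counts executions of the transition function. The machine shown (a unary incrementer) and all names are illustrative choices, not part of the formal model.

```python
# Minimal single-tape Turing machine simulator that counts steps, where one
# "step" is one application of the transition function delta.
def run(delta, tape, state="start", blank="_", max_steps=10_000):
    tape = dict(enumerate(tape))  # sparse tape: position -> symbol
    head, steps = 0, 0
    while state != "halt":
        symbol = tape.get(head, blank)
        state, write, move = delta[(state, symbol)]
        tape[head] = write
        head += {"L": -1, "R": 1, "S": 0}[move]
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step budget exceeded")
    out = "".join(tape[i] for i in sorted(tape)).strip(blank)
    return out, steps

# delta: (state, symbol) -> (new state, symbol to write, head move).
# This hypothetical machine appends one 1 to a unary input string.
delta = {
    ("start", "1"): ("start", "1", "R"),   # scan right over the input
    ("start", "_"): ("halt",  "1", "S"),   # append one more 1, then halt
}

out, steps = run(delta, "111")
# out == "1111"; steps == 4 (three moves right plus the final write)
```

On input $1^n$ this machine takes $n + 1$ steps, so it runs in time $T(n) = n + 1$ under the definition above.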

Time Constructible Functions

An important concept is the idea of time constructible functions, which we will use to quantify and show equivalences among different Turing machine models. It will also be used in later topics (e.g., time hierarchy theorems).

Definition (Time Constructible Function). Let $T : \mathbb{N} \to \mathbb{N}$ be a function. Then we say that $T$ is time constructible if and only if (1) $T(n) \ge n$; and (2) there exists a Turing machine $M$ such that for all $x \in \{0,1\}^*$, we have $M(x) = \mathrm{bin}(T(|x|))$ in time $T(|x|)$. Here, $\mathrm{bin}(m)$ denotes the binary representation of $m$.

Note

This is slightly different than how I presented it in class.

Examples of time constructible functions include $n$, $n \log n$, $n^2$, and $2^n$. Notably, constant functions $T(n) = c$, or any $T(n) = o(n)$ (e.g., $T(n) = \sqrt{n}$ or $T(n) = \log n$), are not time constructible.

The above examples of non-time constructible functions highlight a key idea behind time constructibility: a Turing machine (usually) needs to read its entire input in order to compute a function. The stipulation $T(n) \ge n$ allows for a Turing machine to at least read the entire input before computing $T(n)$. If this restriction is removed, then $T(n) = c$ is still time constructible (you simply ignore all inputs and write the constant $c$ to the output tape), but functions like $T(n) = \sqrt{n}$ or $T(n) = \log n$ remain non-time constructible since the Turing machine is expected to compute $\mathrm{bin}(T(|x|))$ in less time than it takes to read $x$!

Turing Machine Equivalences

A function being time constructible turns out to be a key factor in how we define equivalences among Turing machines (and other models as well).1 Informally, we say that a computational model $\mathcal{A}$ is equivalent to a computational model $\mathcal{B}$ if any computation capable of being performed in $\mathcal{A}$ can be performed in $\mathcal{B}$ (with at most polynomial time overhead), and vice versa. In the context of Turing machines, we say that a computational model $\mathcal{A}$ is equivalent to the Turing machine model if any problem solvable in time $T(n)$ in model $\mathcal{A}$ can be solved by a Turing machine running in time $O(T(n)^c)$ for some constant $c \ge 1$. Intuitively, if $P$ is a program in computational model $\mathcal{A}$ running in time $T(n)$, then a Turing machine will simulate model $\mathcal{A}$ in order to run program $P$ (e.g., similar to modern interpreted programming languages like Python). If $P$ runs in time $T(n)$, and $T$ is not time constructible, then this simulation will not meet our requirements; i.e., it will not be an efficient simulation.

As mentioned in Lecture 1, the $k$-tape Turing machine model we have been working with is equivalent to many other Turing machine (and non-Turing machine) models. We state these relations formally below.

First, recall that it is sufficient to consider a Turing machine which only uses a binary alphabet.

Lemma 2.1. For every $f : \{0,1\}^* \to \{0,1\}^*$ and time constructible $T : \mathbb{N} \to \mathbb{N}$, if $f$ is computable in time $T(n)$ by a Turing machine $M$ with tape alphabet $\Gamma$, then it is computable in time $O(\log|\Gamma| \cdot T(n))$ by a Turing machine $\tilde{M}$ with tape alphabet $\{\triangleright, \square, 0, 1\}$.

Proof Sketch. The main idea is to encode the (non-start and non-blank) symbols of $\Gamma$ using bits. This requires roughly $\log|\Gamma|$ bits to uniquely encode $\Gamma$ in binary. Then the new Turing machine $\tilde{M}$ simply encodes each symbol from $\Gamma$ on its tapes in binary. To simulate a single step of $M$, the machine $\tilde{M}$ must read $\lceil \log|\Gamma| \rceil$ bits from each tape, translate the symbols read into its current state, then execute $M$'s transition function.
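The encoding step can be sketched in Python; the alphabet and symbol names below are illustrative, and the point is just that each symbol gets a fixed-width codeword of $\lceil \log_2 |\Gamma| \rceil$ bits.

```python
from math import ceil, log2

# Encode each symbol of a larger alphabet Gamma as a fixed-width binary
# codeword of ceil(log2 |Gamma|) bits (sketch of Lemma 2.1's encoding).
def make_codes(gamma):
    width = max(1, ceil(log2(len(gamma))))
    return {sym: format(i, f"0{width}b") for i, sym in enumerate(gamma)}, width

gamma = ["a", "b", "c", "d", "e"]      # |Gamma| = 5 -> 3 bits per symbol
codes, width = make_codes(gamma)

def encode_tape(tape):
    return "".join(codes[s] for s in tape)

def decode_tape(bits):
    inv = {v: k for k, v in codes.items()}
    return [inv[bits[i:i + width]] for i in range(0, len(bits), width)]

bits = encode_tape(["b", "e", "a"])
assert decode_tape(bits) == ["b", "e", "a"]
assert len(bits) == 3 * width          # 3 symbols, width bits each
```

Since decoding is unambiguous at fixed width, the binary machine can recover each original symbol by reading exactly `width` consecutive bits, which is where the $O(\log|\Gamma|)$ per-step overhead comes from.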

Next, it turns out that any $k$-tape Turing machine can be readily simulated by a single-tape Turing machine (which many of you may have seen before).

Lemma 2.2. For every $f : \{0,1\}^* \to \{0,1\}^*$ and time constructible $T : \mathbb{N} \to \mathbb{N}$, if $f$ is computable in time $T(n)$ by a $k$-tape Turing machine, then it is computable in time $O(k \cdot T(n)^2)$ by a single-tape Turing machine.

Proof Idea. The proof idea is to stagger the tapes onto the single-tape machine. Notably, since each of the $k$ tapes is infinite, if you try to write them side-by-side on a single-tape machine, you would inevitably run into a situation where you reach the end of an allocation for a work tape, so you'd have to shift the entire contents of the remaining tapes right one space. This would blow up the time to simulate. So instead, you stagger the tapes. Consider tape $i \in \{0, \dots, k-1\}$. Then positions $0, 1, 2, \dots$ of tape $i$ would be written to positions $i, i + k, i + 2k, \dots$ on the single-tape machine.
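The staggering is just an arithmetic index map, sketched below (0-based indexing is an assumption about the convention): cell $j$ of tape $i$ always lands at the same single-tape position, so nothing ever has to shift.

```python
# Interleave k one-directional tapes onto a single tape (Lemma 2.2 sketch):
# cell j of tape i (both 0-indexed) lives at single-tape position k*j + i,
# so every tape gets infinitely many disjoint cells.
def interleave(i, j, k):
    return k * j + i

def deinterleave(pos, k):
    return pos % k, pos // k    # (tape index, cell index)

k = 3
# The first cells of each tape tile the single tape with no collisions:
positions = [interleave(i, j, k) for j in range(3) for i in range(k)]
assert positions == [0, 1, 2, 3, 4, 5, 6, 7, 8]
assert deinterleave(7, k) == (1, 2)   # position 7 holds cell 2 of tape 1
```

The quadratic overhead in Lemma 2.2 comes not from this map but from the single head having to sweep across the used portion of the tape to simulate each step.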

It also turns out that having tapes which are infinite in both directions does not buy you much in terms of computational efficiency.

Lemma 2.3. For every $f : \{0,1\}^* \to \{0,1\}^*$ and time constructible $T : \mathbb{N} \to \mathbb{N}$, if $f$ is computable in time $T(n)$ by a $k$-bidirectional-tape Turing machine (i.e., every tape is infinite in both directions), then $f$ is computable in time $O(T(n))$ by a standard $k$-tape Turing machine (i.e., tapes that are infinite in one direction).

Proof Idea. You can approach this two different ways.

  1. Cut each bidirectional tape in half, then stagger the two halves onto a single one-directional tape (similar to Lemma 2.2 above).

  2. If the bidirectional Turing machine has tape alphabet $\Gamma$, let the standard Turing machine have tape alphabet $\Gamma \times \Gamma$. Then you can encode the bidirectional tape onto the single tape by "folding" it: position $i$ of the standard tape holds the pair of symbols at positions $+i$ and $-i$ of the bidirectional tape.
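The second idea (the "fold") can be sketched as an index map, assuming cell 0 stays at position 0 and the pair at position $i$ holds cells $+i$ and $-i$:

```python
# Fold a bidirectional tape onto a one-directional tape of symbol pairs
# (Lemma 2.3, idea 2): position i >= 1 of the folded tape stores the pair
# (cell +i, cell -i); position 0 stores cell 0 in its first slot.
def fold(pos):
    """Map a bidirectional cell index to (folded position, pair slot)."""
    if pos >= 0:
        return pos, 0        # non-negative cells sit in the first slot
    return -pos, 1           # negative cells sit in the second slot

assert fold(0) == (0, 0)
assert fold(5) == (5, 0)
assert fold(-5) == (5, 1)
# Distinct bidirectional cells never collide on the folded tape:
cells = [fold(p) for p in range(-4, 5)]
assert len(set(cells)) == len(cells)
```

Because crossing the origin just means switching which slot of the pair the simulated head reads, each bidirectional step costs only a constant number of standard steps, giving the $O(T(n))$ bound.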

Universal Turing Machines

We’ve discussed how our $k$-tape Turing machine is equivalent to many other Turing machine models. Next, we will see that we can simulate any Turing machine (in any equivalent model). Much like how the modern computer can run any computation you give it, we will see there is a universal Turing machine which can simulate any Turing machine you give it as input.

Turing Machines are (Binary) Strings

We’ve focused our attention on Turing machines which compute some function $f$, and we haven’t given much thought to how we write down the machine $M$ itself. It turns out that we can conveniently describe Turing machines simply as binary strings. We’ll let $\langle M \rangle$ denote the binary string which represents the Turing machine $M$. Note: there are an infinite number of strings which represent a single Turing machine $M$.

For any $\alpha \in \{0,1\}^*$, we will let $M_\alpha$ denote the Turing machine specified by the string $\alpha$. In this light, notice that

  • We’ve always talked about Turing machines computing some function $f : \{0,1\}^* \to \{0,1\}^*$;
  • Turing machines themselves are described by strings in $\{0,1\}^*$; and
  • Turing machines can also be inputs to these functions!

So there must be a Turing machine which can take Turing machines as input and compute the function that this Turing machine would have computed! This is the universal Turing machine.

Theorem 2.4 (Hennie & Stearns, 1966). There exists a Turing machine $\mathcal{U}$ such that for all $x, \alpha \in \{0,1\}^*$, $\mathcal{U}(x, \alpha) = M_\alpha(x)$. That is, $\mathcal{U}$ computes the output of $M_\alpha$ when run with input $x$. Moreover, if $M_\alpha$ halts within $T(|x|)$ steps on any input $x$ for time constructible $T$, then $\mathcal{U}(x, \alpha)$ halts within $O(T(|x|) \log T(|x|))$ steps, where the hidden constant only depends on $M_\alpha$’s alphabet size, number of states, and number of tapes.

The proof of the above theorem can be found below (see Proof of Theorem 2.4). Here, we’ll give the proof of the above with the $O(T(n) \log T(n))$ bound replaced with $O(T(n)^2)$.

Proof with time bound $O(T(n)^2)$. Suppose that $\mathcal{U}$ is given the input pair $(x, \alpha)$. Without loss of generality, we can assume that the Turing machine $M_\alpha$ has tape alphabet $\{\triangleright, \square, 0, 1\}$ and has a single work tape (i.e., it is a $3$-tape Turing machine). If not, then $\mathcal{U}$ can transform $\alpha$ into an equivalent Turing machine, denoted as $M_{\tilde\alpha}$, with these properties by Lemmas 2.1 and 2.2.2 In this case, if $M_\alpha$ runs in time $T(n)$, then the resulting equivalent Turing machine runs in time $\tilde{T}(n) = O(T(n)^2)$ (ignoring the $\log|\Gamma|$ factors since $|\Gamma|$ is fixed).

The universal machine $\mathcal{U}$ will be a $5$-tape Turing machine; i.e., one input tape, one output tape, and 3 work tapes. $\mathcal{U}$ has alphabet $\{\triangleright, \square, 0, 1\}$. Now $\mathcal{U}$ will simulate $M_{\tilde\alpha}$ as follows.

  • $\mathcal{U}$ uses its input, output, and first work tape to identically copy the operations $M_{\tilde\alpha}$ performs on these tapes (recall $M_{\tilde\alpha}$ has $3$ tapes).
  • $\mathcal{U}$ encodes the state space $Q$ of $M_{\tilde\alpha}$ on its second work tape.
  • $\mathcal{U}$ encodes the transition function $\delta$ of $M_{\tilde\alpha}$ on its third work tape. The transition function is simply encoded as a table of key-value pairs.

In order to simulate a single step of $M_{\tilde\alpha}$’s computation, the machine $\mathcal{U}$ does the following.

  1. Read the current symbols under the input tape, output tape, and first work tape heads. This identically matches what $M_{\tilde\alpha}$ does and takes constant time.
  2. Read the current state of $M_{\tilde\alpha}$ from the second work tape. Since the tape alphabet is binary, the states of $M_{\tilde\alpha}$ take $O(\log|Q|)$ bits to encode, so reading the current state takes $O(\log|Q|)$ time steps (i.e., move to the end of the current state, go back to the start of the work tape).
  3. Let $q$ be the current state, and let $\sigma_1, \sigma_2, \sigma_3$ be the symbols read from the input, output, and first work tapes, respectively. Scan the third work tape for the key $(q, \sigma_1, \sigma_2, \sigma_3)$.
  4. Once this key is found, read the value from the corresponding table entry. The value is exactly $(q', \sigma_2', \sigma_3', z_1, z_2, z_3)$, where $z_i \in \{L, S, R\}$ for $i \in \{1, 2, 3\}$.
  5. Execute the transition function of $M_{\tilde\alpha}$.
    1. Write $\sigma_2'$ to the output head and $\sigma_3'$ to the head of work tape 1. This takes constant time.
    2. Write the new state $q'$ to the second work tape and reset the tape head after. This takes $O(\log|Q|)$ time.
    3. Move tape head $i$ in direction $z_i$ for $i \in \{1, 2, 3\}$. This takes constant time.
    4. Move the head of the third work tape back to the start.

Now, the time complexities of (3) and (5.4) above are the same. In particular, in the worst case, $\mathcal{U}$ must scan to the end of the table representing $\delta$ to find the correct key. There are $|Q| \cdot 4^3$ keys in this table, and each key has an entry in the table. Since each symbol of $\{\triangleright, \square, 0, 1\}$ takes two bits to encode, and because we can encode each head direction $z_i$ with only two more bits, we can conclude that each table entry (i.e., each key-value pair) has length $O(\log|Q|)$. This means to write down a single entry, we need $O(\log|Q|)$ bits. Moreover, there are a total of $|Q| \cdot 4^3$ entries in the table, so the total time of executing (3) or (5.4) is at most $O(|Q| \log|Q|)$ time (absorbing the constant $4^3$).

Since $M_{\tilde\alpha}$ is fixed, to simulate a single step of $M_{\tilde\alpha}$ on $\mathcal{U}$ requires $O(|Q| \log|Q|)$ time, which is a constant with respect to the input length. So if $M_{\tilde\alpha}$ runs in time $\tilde{T}(n)$, then $\mathcal{U}$ runs in time $O(\tilde{T}(n))$. Now $\tilde{T}(n) = O(T(n)^2)$ by the transformation we performed on $M_\alpha$ to obtain $M_{\tilde\alpha}$. Thus, $\mathcal{U}$ simulates $M_\alpha$ in at most $O(T(n)^2)$ time.
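The table lookup in steps (3)-(4) behaves like the following Python sketch, where the table is scanned linearly from the start on every simulated step; the entries shown are illustrative, not a full machine.

```python
# Linear scan of an encoded transition table, as in steps (3)-(4): the
# table is a flat list of (key, value) pairs, keyed by the current state
# and the three symbols under the heads. Entries here are illustrative.
table = [
    (("q0", "1", "_", "_"), ("q0", "_", "1", "R", "S", "R")),
    (("q0", "_", "_", "_"), ("halt", "_", "_", "S", "S", "S")),
]

def lookup(state, symbols):
    """Scan the table for key (q, s1, s2, s3); worst case O(#entries)."""
    key = (state, *symbols)
    for k, v in table:        # worst case: scan the entire table
        if k == key:
            return v
    raise KeyError(key)

assert lookup("q0", ("1", "_", "_"))[0] == "q0"
assert lookup("q0", ("_", "_", "_"))[0] == "halt"
```

The scan cost depends only on the (fixed) table size, mirroring why simulating a single step costs $\mathcal{U}$ only a constant with respect to the input length.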

Turing Machines and Languages

We’ve spent most of our time discussing Turing machines and how they compute functions. We’ll now shift to mostly talking about Turing machines in the context of deciding languages.

Recall that a language $L$ is simply a subset of $\{0,1\}^*$. Notably, we can define a function $f_L : \{0,1\}^* \to \{0,1\}$ as $f_L(x) = 1$ if and only if $x \in L$; this immediately implies that $f_L(x) = 0$ if and only if $x \notin L$. So there is a natural correspondence between computing functions and deciding set membership in a language $L$.
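The correspondence can be sketched directly: any membership test yields an indicator function and vice versa. The language of binary palindromes below is just an illustrative choice.

```python
# A language L (here: binary palindromes, an illustrative choice) and its
# indicator function f_L, with f_L(x) = 1 iff x is in L.
def in_L(x: str) -> bool:
    return x == x[::-1]          # membership test for L

def f_L(x: str) -> int:
    return 1 if in_L(x) else 0   # the corresponding function f_L

assert f_L("0110") == 1   # "0110" reads the same reversed, so it is in L
assert f_L("01") == 0     # "01" reversed is "10", so it is not in L
```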

Key to our later dealings with complexity classes will be the idea of Turing decidability. We’ll build up to this idea by first introducing Turing recognizability.

Definition (Turing Recognizable Language). A language $L \subseteq \{0,1\}^*$ is said to be Turing recognizable if there exists a Turing machine $M$ such that for all $x \in L$, $M(x) = 1$. In particular, $M$ always halts and outputs $1$ if $x \in L$.

Recognizability only requires that the Turing machine halt on any valid member of the language. If, however, one hands this Turing machine some $x \notin L$, its behavior is undefined and not guaranteed! We’d like to strengthen this to make sure our Turing machine always halts, whether or not its input is in the language. This gives us decidability.

Definition (Turing Decidable Language). A language $L \subseteq \{0,1\}^*$ is said to be Turing decidable if there exists a Turing machine $M$ such that the following hold for any $x \in \{0,1\}^*$:

  • $M(x) = 1$ if and only if $x \in L$; and
  • $M(x) = 0$ if and only if $x \notin L$.

Notice that the above definition immediately means that $M$ halts on all possible inputs. This is because, equivalently stated, if $x \notin L$ then $x \in \bar{L}$, where $\bar{L}$ is the complement of $L$, which is defined as $\bar{L} = \{0,1\}^* \setminus L$ (i.e., everything in $\{0,1\}^*$ but not in $L$).

An equivalent definition of decidability states that both the language and its complement are recognizable.

Lemma 2.5. A language $L$ is Turing decidable if and only if both $L$ and $\bar{L}$ are Turing recognizable.

Undecidability

Unfortunately, there are many (interesting) languages that are undecidable; that is, there does not exist any Turing machine which decides the language. We’ll begin by showing the existence of at least one undecidable language.

Theorem 2.6. There exists a language that is not Turing decidable (i.e., it is undecidable).

Proof. First define the language $L_D = \{\langle M \rangle : M(\langle M \rangle) = 1\}$; i.e., $L_D$ is the set of all strings $\langle M \rangle$ such that the Turing machine $M$, when given as input its own description $\langle M \rangle$, halts and outputs $1$. Now define the complement language $\bar{L}_D = \{0,1\}^* \setminus L_D$. We claim that $\bar{L}_D$ is undecidable.

We show this via a proof by contradiction. So towards contradiction, assume that $\bar{L}_D$ is decidable. Then there exists a Turing machine $N$ which decides this language. This implies that for any $x$, $N(x) = 1$ if and only if $x \in \bar{L}_D$ and $N(x) = 0$ if and only if $x \notin \bar{L}_D$.

Consider $N(\langle N \rangle)$. We have that

$$N(\langle N \rangle) = 1 \iff \langle N \rangle \in \bar{L}_D \iff \langle N \rangle \notin L_D \iff N(\langle N \rangle) \neq 1.$$

Thus, we have a contradiction as $N(\langle N \rangle) = 1$ if and only if $N(\langle N \rangle) \neq 1$. This implies that $\bar{L}_D$ is undecidable.

Notably, the above proof technique is known as diagonalization. We’ll use it later when we discuss time hierarchy theorems.
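The diagonal idea can be sketched over a finite "table" of machines (the machines below are illustrative Python functions, not Turing machines): build a function that flips machine $i$'s answer on input $i$, so it cannot equal any row of the table.

```python
# Diagonalization sketch over a finite table of "machines": construct a
# function that differs from machine i on input i, so it cannot equal any
# machine in the table. The machines here are illustrative stand-ins.
machines = [
    lambda x: 0,          # machine 0: always outputs 0
    lambda x: 1,          # machine 1: always outputs 1
    lambda x: x % 2,      # machine 2: parity of the input
]

def diagonal(x):
    # Flip machine x's answer on input x (only defined for listed x).
    return 1 - machines[x](x)

# diagonal disagrees with every machine on the diagonal:
assert all(diagonal(i) != machines[i](i) for i in range(len(machines)))
```

In the proof above, the "table" is the (countable) list of all Turing machines, and the assumed decider $N$ plays the role of a row that must both agree and disagree with the diagonal, which is impossible.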

Question

Is $\bar{L}_D$ Turing recognizable?

Different from Class

I incorrectly stated in class that it was recognizable. However, $\bar{L}_D$ is, in fact, not recognizable. This is because $L_D$ from the above proof is recognizable. By Lemma 2.5, if $\bar{L}_D$ were Turing recognizable, then $\bar{L}_D$ would be decidable, but clearly it is not!

One may argue that the language $\bar{L}_D$ is not a very interesting language, and may not come up in the real world. However, we’ll take one step up and consider a more interesting language that would be great for us if it were decidable! Unfortunately, it is not decidable.

The Halting Problem

The Halting problem asks the following simple question: given a Turing machine $M$ and an input $x$, does $M$ halt on input $x$? More formally, it is specified by the following language:

$$\mathsf{HALT} = \{(\langle M \rangle, x) : M \text{ halts on input } x\}.$$

Theorem 2.7. $\mathsf{HALT}$ is undecidable.

We’ll give the proof of this theorem in Lecture 3.

Proof of Theorem 2.4

This proof is taken directly from Arora & Barak’s book with the following notes:

  • Theorem 1.9 in the proof corresponds to Theorem 2.3 in these lecture notes;
  • Claim 1.6 in the proof corresponds to Lemma 2.2 in these lecture notes; and
  • Claim 1.5 in the proof corresponds to Lemma 2.1 in these lecture notes.

The proof can be found in the following pdf: Proof of Theorem 2.4


  1. “Key” here meaning it makes proofs much simpler.

  2. Lemma 2.2 tells us that a $k$-tape Turing machine can be simulated by a one-tape Turing machine with quadratic overhead. The same proof can be applied to reduce $k$ tapes to $3$ tapes, with a single input, output, and work tape (i.e., transform the $k - 2$ work tapes into a single work tape, keep the input/output tapes the same).