\documentstyle[12pt,epsfig]{article} \textwidth=6in \oddsidemargin=0.25in \evensidemargin=0.25in \topmargin=-0.1in \footskip=0.8in \parindent=0.0cm \parskip=0.3cm \textheight=8.00in \setcounter{tocdepth} {3} \setcounter{secnumdepth} {2} \sloppy \newcommand{\p}{{\rm P}} \newcommand{\pspace}{{\rm PSPACE}} \newcommand{\np}{{\rm NP}} \newcommand{\conp}{{\rm coNP}} \newcommand{\exptime}{\hbox{EXPTIME}} \newcommand{\ti}[1]{{\rm TIME}(#1)} \newcommand{\spc}[1]{{\rm SPACE}(#1)} \newcommand{\nspace}[1]{{\rm NSPACE}(#1)} \newcommand{\conspace}[1]{{\rm coNSPACE}(#1)} \newcommand{\aspace}[1]{{\rm ASPACE}(#1)} \newcommand{\atime}[1]{{\rm ATIME}(#1)} \newcommand{\ap}{{\rm AP}} \newcommand{\al}{{\rm AL}} \newcommand{\settabs}{\hspace{0.25 in}\= \hspace{0.25 in}\= \hspace{0.25 in}\= \hspace{0.25 in}\= \hspace{0.25 in}\= \hspace{0.25 in}\= \hspace{0.25 in}\= \hspace{0.25 in}\= \hspace{0.25 in}\= \hspace{0.25 in}\= \hspace{0.25 in}\= \kill} \newcommand{\CHOOSE}{\textbf{choose~from}} \newcommand{\ACCEPT}{\textbf{accept}} \newcommand{\REJECT}{\textbf{reject}} \newcommand{\GOTO}{\textbf{goto}} \newcommand{\END}{\textbf{end}} \newcommand{\TRUE}{\textbf{true}} \newcommand{\FALSE}{\textbf{false}} \begin{document} \input{preamble.tex} \renewcommand{\FOR}{\textbf{for}} \renewcommand{\TO}{\textbf{to}} \renewcommand{\DO}{\textbf{do}} \renewcommand{\WHILE}{\textbf{while}} \renewcommand{\AND}{\textbf{and}} \renewcommand{\IF}{\textbf{if}} \renewcommand{\THEN}{\textbf{then}} \renewcommand{\ELSE}{\textbf{else}} \lecture{2}{February 4, 1999}{Daniel A. Spielman}{Antonio Ram\'\i rez} \section{The Immerman-Szelepcs\'enyi Theorem} Most of this lecture will be devoted to the proof of this theorem, which is stated as follows. \begin{theorem}[Immerman-Szelepcs\'enyi]\label{IS} For $f(n) \geq \log n$, \begin{displaymath} \nspace{f(n)} = \conspace{f(n)}. 
\end{displaymath}
\end{theorem}

A consequence of this theorem is that, in some sense, time complexity is more interesting than space complexity, as it is not known whether $\np = \conp$.

Our proof will use two auxiliary results:

\begin{lemma}\label{L1} Given a digraph $G$ with $n$ nodes, and a node $x$ of $G$, it is possible to compute the number of nodes reachable from $x$ in $\nspace{\log n}$. \end{lemma}

\begin{lemma}\label{L2} Given two nodes $x, y$ of a digraph $G$, {\em and} the number $k$ of nodes reachable from $x$, there is a logspace NTM\footnote{Nondeterministic Turing machine.} that {\em rejects} if and only if $y$ is reachable from $x$.\end{lemma}

\subsection{Applying the lemmas}

We will apply these results as follows. Consider $M$, a $\conspace{f(n)}$ machine. A configuration of $M$ consists of the contents of the tape (up to the maximum $f(n)$ cells that $M$ is allowed to use at any time), the position of the head, and the finite state. Form a graph $G$ with all the configurations of $M$ as nodes, with a directed edge from ${\rm config}_1$ to ${\rm config}_2$ if $M$ can go from ${\rm config}_1$ to ${\rm config}_2$ in one step.

We would like to show that there is an $\nspace{f(n)}$ machine that decides the same language as $M$. Recall that a co-nondeterministic machine accepts an input if and only if all computations with that input end in an accept state. Reinterpreting this condition using $G$, $M$ accepts if and only if there is {\em no} path from the starting node to a node denoting a reject state. Assume without loss of generality that $G$ has exactly one rejecting configuration.

Let $x$ be the starting configuration of $M$, and $y$ the rejecting configuration. Consider an NTM $M'$ that does the following:
\begin{enumerate}
\item Compute the number of nodes of $G$ that are reachable from $x$.
\item Using this number, accept if $y$ is not reachable from $x$.
\end{enumerate} Because of Lemmas~\ref{L1} and \ref{L2}, $M'$ works in $\nspace{\log n}$, where $n$ is the number of nodes in $G$. But $G$ has $O(c^{f(n)})$ nodes, where $c$ is the alphabet size: each one of the $f(n)$ possible cells can take one of $c$ values (the other components of a configuration are not significant asymptotically). Hence, $M'$ works in $\nspace{\log O(c^{f(n)})} = \nspace{f(n)}$, as desired. This shows that $\conspace{f(n)} \subseteq \nspace{f(n)}$. But this implies the reverse inclusion too, since ${\rm co(coNSPACE)} = {\rm NSPACE}$. Then Theorem~\ref{IS} follows from the lemmas. \subsection{Computing functions nondeterministically} There is a subtle point in Lemma~\ref{L1}: we know what it means for an NTM to accept a language, but it is not immediately clear what it means for an NTM to {\em compute} a function. \begin{definition} An NTM $M$ is said to compute a function $f$ on input $x$ if $M$ has at least one accepting computation on input $x$ and, for all accepting computations of $M$ on $x$, $f(x)$ is left on the tape when done. \end{definition} To convince ourselves that this definition is the correct one, consider another plausible definition: $M$ computes $f(x)$ if at least one branch accepts with the right answer. This definition is flawed, however: nondeterminism allows $M$ to compute {\em every} possible answer, one of which must be the correct one. This is absurd, since it would mean that $M$ computes every function. \subsection{Proving the lemmas} \begin{proof-of-lemma}{\ref{L2}} Determining whether a node $v$ is reachable from $x$ by {\em accepting} can be done in $\nspace{\log n}$. 
The idea is to nondeterministically guess a sequence of edges to follow to reach $v$ from $x$: {\sffamily \begin{tabbing} \settabs \textbf{Algorithm:}\\ \>\>$u \leftarrow x$ \\ \>\>\FOR\ $j = 1\ldots n$\\ \>\>\>$u \leftarrow$~\CHOOSE\ $\{u\}\cup\{\textrm{neighbors of}\ u\}$ \\ \>\>\IF\ $u=v$ \THEN\ \ACCEPT\ \ELSE\ \REJECT \end{tabbing}} At each step, the algorithm nondeterministically follows an edge or stays at the same node, and after $n$ steps it checks to see if $v$ has been reached. At all times, it only stores the name of a node and the counter, which take space $O(\log n)$. We want an NTM that {\em rejects} when there is such a path, knowing the number $k$ of nodes reachable from $x$. The main idea for the proof is this: If there exist $k$ distinct nodes, all reachable from $x$ and each different from $y$, then $y$ is not reachable from $x$. This fact is used by the following algorithm: {\sffamily \begin{tabbing} \settabs \textbf{Algorithm:}\\ \>\>$i \leftarrow 0$\\ \>\>\FOR\ all nodes $v$ of $G$\\ \>\>\>nondeterministically \GOTO\ A or B\\ \>A:\>\>\IF\ $v$ is reachable from $x$\\ \>\>\>\>\IF\ $v = y$\\ \>\>\>\>\>\THEN\ \REJECT\\ \>\>\>\>\>\ELSE\ $i \leftarrow i+1$\\ \>B:\>\END\ \FOR\\ \>\>\IF\ $i < k$ \THEN\ \REJECT\ \ELSE\ \ACCEPT \end{tabbing}} On each iteration the algorithm is allowed to either ignore the vertex or to test it for connectedness from $x$, using the first algorithm\footnote{It is possible to call nondeterministic ``subroutines,'' as long as we have no actions that are conditional on the subroutine rejecting--if the subroutine rejects, the whole machine rejects on that execution.}. If it is connected from $x$, a counter is incremented. At the end, it accepts only if at least $k$ distinct nodes connected to $x$ were found. This algorithm proves the lemma. \end{proof-of-lemma} \begin{proof-of-lemma}{\ref{L1}} Let $S_i$ be the set of nodes reachable from $x$ in at most $i$ steps. The plan is to compute $|S_k|$ using $|S_{k-1}|$. 
If we succeed in doing this, the answer will be $|S_n|$. Clearly, $|S_0| = 1$. The idea of the algorithm is to go through all nodes, and for each one test whether it is in $S_k$. A node is in $S_k$ if and only if it is a neighbor of a node in $S_{k-1}$, where, as in the reachability algorithm above, each node counts as a neighbor of itself (so that $S_{k-1} \subseteq S_k$). What we would like to do is then:
{\sffamily
\begin{tabbing}
\settabs
\textbf{Algorithm:}\\
\>\>$l \leftarrow 0$\\
\>\>\FOR\ all nodes $w$ in $G$\\
\>\>\>\IF\ $w$ is a neighbor of a node in $S_{k-1}$ \THEN\ $l \leftarrow l+1$\\
\>\>\END\ \FOR\\
\>\>$|S_k| \leftarrow l$
\end{tabbing}}
We must figure out how to determine in $\nspace{\log n}$ if a node is a neighbor of a node in $S_{k-1}$, knowing only $|S_{k-1}|$. This can be done as follows.
{\sffamily
\begin{tabbing}
\settabs
\textbf{Algorithm:}\\
\>\>$l \leftarrow 0$\\
\>\>\FOR\ all nodes $w$ in $G$\\
\>\>\>/* if $w$ is a neighbor of a node in $S_{k-1}$, increment $l$: */\\
\>\>\>$i \leftarrow 0$\\
\>\>\>$flag \leftarrow \FALSE$\\
\>\>\>\FOR\ all nodes $v$ in $G$\\
\>\>\>\>nondeterministically \GOTO\ A or B\\
\>A:\>\>\>\IF\ $v$ is reachable from $x$ in at most $k-1$ steps \THEN\\
\>\>\>\>\>\IF\ $w$ is a neighbor of $v$ \THEN\ $flag \leftarrow \TRUE$\\
\>\>\>\>\>$i \leftarrow i+1$\\
\>B:\>\>\>\END\ \IF\\
\>\>\>\END\ \FOR\ $v$\\
\>\>\>\IF\ $i < |S_{k-1}|$ \THEN\ \REJECT\\
\>\>\>\IF\ $flag = \TRUE$ \THEN\ $l \leftarrow l+1$\\
\>\>\END\ \FOR\ $w$\\
\>\>$|S_k| \leftarrow l$
\end{tabbing}}
The inner loop looks at a set of nodes, ignoring those that are not in $S_{k-1}$, and it sets the flag to true if a node in $S_{k-1}$ is adjacent to $w$. At the same time, it counts how many nodes in $S_{k-1}$ it saw in total. By the end of this loop, if the flag is set, it means that $w$ is adjacent to some node of $S_{k-1}$, and hence in $S_k$. In this case, $l$ is incremented, provided that the inner loop saw all the nodes in $S_{k-1}$.
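
To illustrate the inner loop, here is a small worked example. The graph is hypothetical and not part of the lecture, and we assume that each node counts as a neighbor of itself. Let $G$ have nodes $x, a, b, c$ and edges $x \to a$, $a \to b$, $c \to a$, so that
\begin{displaymath}
S_0 = \{x\}, \qquad S_1 = \{x, a\}, \qquad S_2 = \{x, a, b\}.
\end{displaymath}
Suppose $|S_1| = 2$ is known and we want $|S_2|$. For each $w$, a non-rejecting run of the inner loop must take branch \textsf{A} exactly for $v = x$ and $v = a$, guessing a path of length at most $1$ from $x$ each time, and so ends with $i = 2$. A run that skips $x$ or $a$ ends with $i < 2$ and rejects. A run that takes branch \textsf{A} for $b$ or $c$ rejects inside the reachability test, since neither is reachable from $x$ in at most one step. On the non-rejecting runs the flag behaves as follows:
\begin{displaymath}
\begin{array}{c|c|c}
w & \textrm{witness}\ v \in S_1 & l\ \textrm{after this}\ w \\ \hline
x & x & 1 \\
a & x & 2 \\
b & a & 3 \\
c & \textrm{none} & 3
\end{array}
\end{displaymath}
Every non-rejecting run therefore outputs $l = 3 = |S_2|$, as the definition of computing a function by an NTM requires.
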
More formally, we need to prove that:
\begin{enumerate}
\item[(i)] for a given $w$, $l$ is incremented only if $w\in S_k$, and
\item[(ii)] for a given $w\in S_k$, $l$ is incremented if the computation does not reject.
\end{enumerate}

(i) is clear: for $l$ to be incremented, $flag$ must have been set to $\TRUE$, which happens only if a node $v$ is seen that is both a neighbor of $w$ and contained in $S_{k-1}$. This means that $w\in S_k$.

To show (ii), consider some $w\in S_k$. Note that the non-rejecting executions of the inner loop are those in which the \textsf{A} branch is taken only for nodes in $S_{k-1}$. If not all of $S_{k-1}$ is hit, then we reject after verifying that $i < |S_{k-1}|$. Otherwise, the inner loop has seen every node of $S_{k-1}$; since $w\in S_k$, some node among the ones seen must be adjacent to $w$, so the flag is set, and both conditions for incrementing $l$ are met.

Note that the execution that always takes branch \textsf{A} does {\em not} accept: if the inner loop sees a node that is not reachable from $x$ in at most $k-1$ steps, the whole computation rejects, because the subroutine that computes this rejects.

Again, it is easily seen that this algorithm uses only logarithmic space: it stores a constant number of node names and counters, each of $O(\log n)$ bits. This concludes the proof.
\end{proof-of-lemma}

\section{Introduction to Alternation}

Alternating Turing machines (ATMs) are a generalization of both nondeterministic and co-nondeterministic TMs. NTMs accept if at least one computation accepts; co-NTMs accept if all computations accept. ATMs mix both possibilities: at each nondeterministic branch state, the machine specifies whether it accepts if all branches accept, or if at least one accepts.

\begin{definition} An alternating Turing machine is a Turing machine in which each non-halting state is labeled with either $\exists$ or $\forall$, and such that every state has at most two transitions originating from it. $\exists$-states are called existential, $\forall$-states are called universal.
\end{definition}

The limitation of two transitions from each node is for practical purposes, and does not affect the power of ATMs.

To determine whether or not an ATM $M$ accepts an input $x$, consider its computation tree, which has a node for each possible state in an execution and a directed edge for each transition (Figure~\ref{atm-tree}). The root of the tree will be the starting state, and the leaves will be accepting or rejecting states. Working from bottom to top, label each node with ``accept'' or ``reject'': an $\exists$-state is labelled as ``accept'' if at least one of its children is accepting; a $\forall$-state is labelled as ``accept'' if all of its children accept. The outcome of the computation is the label of the root node.

\begin{figure}[htb]
\begin{center}
\mbox{\psfig{figure=lecture2.eps}}
\caption{The computation tree of an ATM.}
\label{atm-tree}
\end{center}
\end{figure}

\subsection{Alternating complexity classes}

The ATM model determines several complexity classes.

\begin{definition}
\begin{eqnarray*}
\atime{f(n)} & = & \{L\ |\ L\ \textrm{is a language decidable in time}\ O(f(n))\ \textrm{by an ATM}\}\\
\aspace{f(n)} & = & \{L\ |\ L\ \textrm{is a language decidable in space}\ O(f(n))\ \textrm{by an ATM}\}\\
\ap & = & \bigcup_{k=0}^\infty \atime{n^k}\\
\al & = & \aspace{\log n}
\end{eqnarray*}
\end{definition}

Some known relations involving these new complexity classes are:
\begin{eqnarray}
\np & \subseteq & \ap \\
\conp & \subseteq & \ap \\
\pspace & = & \ap \label{future}\\
\al & = & \p
\end{eqnarray}
The first two relations are straightforward: NTMs and co-NTMs are special cases of ATMs. We will prove the last two in future lectures, but one of the directions of (\ref{future}) is easy: the $\pspace$-complete language TQBF (true quantified boolean formulas) is easily seen to be in $\ap$; hence $\pspace \subseteq \ap$.
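
As a small illustration of why TQBF is in $\ap$ (the formula is chosen here for illustration and is not from the lecture), consider
\begin{displaymath}
\phi = \forall x\, \exists y\, (x \oplus y).
\end{displaymath}
An ATM can enter a $\forall$-state to branch on $x \in \{0,1\}$, then an $\exists$-state to branch on $y \in \{0,1\}$, and finally evaluate $x \oplus y$ deterministically, accepting exactly when the value is $1$. The four leaves are labelled as follows:
\begin{displaymath}
\begin{array}{cc|c}
x & y & \textrm{leaf} \\ \hline
0 & 0 & \textrm{reject} \\
0 & 1 & \textrm{accept} \\
1 & 0 & \textrm{accept} \\
1 & 1 & \textrm{reject}
\end{array}
\end{displaymath}
Each $\exists$-node has one accepting child and is labelled ``accept''; the $\forall$-root then has two accepting children, so the machine accepts, matching the fact that $\phi$ is true. For $\exists y\, \forall x\, (x \oplus y)$ the quantifier states are swapped: each $\forall$-node has a rejecting child, so the $\exists$-root rejects, and indeed that formula is false. In general the machine spends one branching step per quantified variable and then evaluates the quantifier-free part, which takes time polynomial in the length of the formula.
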
A more interesting problem in $\ap$ is
\begin{eqnarray*}
\textrm{MIN-CIRCUIT} & = & \{\textrm{boolean circuits}\ c\ |\ \textrm{no circuit smaller than}\ c\\
& & \textrm{computes the same function as}\ c\}
\end{eqnarray*}
The condition for a circuit to be minimal can be written as follows:
\begin{displaymath}
\textrm{MIN-CIRCUIT} = \{\textrm{boolean circuits}\ c\ |\ \forall\ \textrm{circuit}\ c'\ \textrm{where}\ |c'|<|c|, \exists x: c'(x) \neq c(x)\}
\end{displaymath}
Expressing the condition using quantifiers makes it easy to see how it can be decided in $\ap$: universally choose a smaller circuit $c'$, existentially choose an input $x$, and check deterministically that $c'(x) \neq c(x)$. This example is interesting because MIN-CIRCUIT is not known to be in $\np$.
\end{document}