\documentclass[oneside]{article} \usepackage{amsmath} \usepackage{amsthm} \usepackage{amsfonts} \usepackage[mathbf]{euler} \advance\oddsidemargin by-.5in \advance\textwidth by1in \newtheorem{theorem}{Theorem} \newtheorem{lemma}{Lemma} \newtheorem{definition}{Definition} \newcommand{\handout}[5]{ \noindent \begin{center} \framebox{ \vbox{ \hbox to 5.78in { {\bf 18.409 The Behavior of Algorithms in Practice} \hfill #2 } \vspace{4mm} \hbox to 5.78in { {\Large \hfill #5 \hfill} } \vspace{2mm} \hbox to 5.78in { {\it #3 \hfill #4} } } } \end{center} \vspace*{4mm} } \newcommand{\lecture}[4]{\handout{#1}{#2}{Lecturer: #3}{Scribe: #4}{Lecture #1}} \begin{document} \lecture{12}{3/21/2002}{Dan Spielman}{Arvind Sankar} \def\a{\mathbf{a}}\def\b{\mathbf{b}}\def\c{\mathbf{c}} \def\x{\mathbf{x}}\def\y{\mathbf{y}} \def\db{\mathbf{\Delta\b}} \def\dc{\mathbf{\Delta\c}} \def\dA{\Delta A}\def\da{\mathbf{\Delta\a}} \def\dL{\Delta L} \section{Introduction} Recall Dan's favorite linear program, from last class: \vskip\abovedisplayskip \begingroup \hsize=0.45\hsize \valign{#&\mathstrut\vbox{$#$}&\mathstrut\vbox{s.t.~$#$}\cr {\bf Primal}& \min \sum y_i& \begin{aligned}[t] \textstyle\sum_{i=1}^n y_i \a_i&=\c\\ y_i& \ge0\quad i=1,\dots,n \end{aligned}\cr {\bf Dual}& \max \c^T\x& \begin{aligned}[t] \a_i^T\x&\le1\quad i=1,\dots,n \end{aligned}\cr} \endgroup \vskip\belowdisplayskip For this program, the geometric interpretation in terms of the convex hull of~$\a_1,\dots,\a_n$ makes it obvious that the primal and dual have the same optimum~(or~{\em value}). In the first part of today's lecture, we will extend this property to arbitrary linear programs. Next, we will consider a refinement of the simple duality statement: \begin{align*} \text{Primal infeasible}&\implies\text{Dual unbounded} \\ \text{Dual infeasible}&\implies\text{Primal unbounded} \end{align*} due to Renegar~\cite{renegar95}, who defined a condition number for the primal and dual programs, and showed that the value of the primal can be bounded in terms of the condition number of the dual, and conversely. The condition number is thus a quantitative refinement of the notion of infeasibility. Renegar used his condition number to improve complexity-theoretic bounds on the running time of linear programming algorithms, which are worst-case bounds of the form \[ LP\in\mbox{DTIME}(n^kL) \] where~$k$ is a small constant~($3.5$ is known), and~$L$ measures the bit-length of the input. It is known, for example, that one can take~$L$ to be~$n$ times the maximum bit length of the numbers involved in the program. However, in practice, this bound proves to be overly pessimistic. Renegar showed that one could replace~$L$ by the logarithm of his condition number. In the worst case, the condition number turns out to be exponential in~$L$, but in practical applications, the linear programs are generally known to be well-conditioned~(if they weren't, the solution would not correspond very well to the real-world situation being modelled anyway). \section{Proof of LP duality} The general linear program can be written in the following form: \vskip\abovedisplayskip \begingroup \hsize=0.45\hsize \valign{#&\mathstrut\vbox{$#$}&\mathstrut\vbox{s.t.~$#$}\cr {\bf Primal}& \min \b^T\y& \begin{aligned}[t] \textstyle\sum_{i=1}^n y_i \a_i&=\c\\ y_i& \ge0\quad i=1,\dots,n \end{aligned}\cr {\bf Dual}& \max \c^T\x& \begin{aligned}[t] \a_i^T\x&\le b_i\quad i=1,\dots,n \end{aligned}\cr} \endgroup \vskip\belowdisplayskip It is very easy to prove the following statement, which is called the weak duality theorem. \begin{lemma} The value of the dual is less than or equal to the value of the primal. \end{lemma} \begin{proof} We will prove the slightly stronger statement that if~$\y$ is primal feasible, and~$\x$ is dual feasible, then \[ \c^T\x\le\b^T\y \] Let~$A$ denote the~$d\times n$ matrix \[ A=[\a_1\dots\a_n] \] Then we have \[ \c^T\x = \x^TA\y = (A^T\x)^T\y \le \b^T\y \] and the proof is complete. \end{proof} We will extend the strong duality result that is obvious for Dan's linear program to the general linear program in two steps: first, we consider the case where~$b_i>0$, and then, generalize to arbitrary~$\b$. \begin{lemma} The value of the primal is equal to the value of the dual, assuming that all the~$b_i$ are positive. \end{lemma} \begin{proof} We alter the linear program by defining \[ \a'_i=\frac{\a_i}{b_i}\qquad y'_i=b_iy_i\qquad b'_i=1 \] It can easily be seen that this new linear program is in the form of Dan's program, and the values of the primal and dual are exactly the same in both programs. \end{proof} Now for arbitrary~$b_i$: \begin{theorem}[Strong duality theorem] The value of the primal equals that of the dual, assuming that the dual is feasible. \end{theorem} \begin{proof} Let~$\x_0$ be dual feasible. Define \begin{align*} \x' &= \x-\x_0 \\ b'_i &= b_i-\a_i^T\x_0 \end{align*} The objective functions change as follows: \begin{align*} \c^T\x &= \c^T\x'+\c^T\x_0 \\ \b^T\y &= (\b')^T\y+\sum y_i\a_i^T\x_0 \\ &= (\b')^T\y+\c^T\x_0 \end{align*} Now,~$b'_i>0$ because~$\x_0$ satisfies the dual constraints, and by the previous lemma, we know that the optimal values of~$\c^T\x'$ and~$(\b')^T\y$ are equal, so we conclude that the optimal values of~$\c^T\x$ and~$\b^T\y$ are also equal. \end{proof} \def\norm#1{\left\|#1\right\|} \def\inorm#1{\norm{#1}_\infty} \def\sgn{\mathop{\mathgroup\symoperators sgn}} \section{Renegar condition number} Renegar, in~\cite{renegar95}, suggests analyzing the complexity of linear programming in terms of a condition number, rather than in terms of the bit length of the input. This allows him to derive a sharper upper bound on the running time of linear programming algorithms. In this section, we will consider the definition of the condition number, and apply it to obtain bounds on the value of the linear program. The results in this part of the lecture are from section~2 of~\cite{renegar95}. If~$L=(A,\b,\c)$ is an instance of linear programming, we define \[ \inorm{L} = \max(\norm{A}_{\max},\inorm{\b},\inorm{\c}) \] We will denote the value of the linear program~$L$ by~$v(L)$. \begin{definition} We define the primal and dual condition numbers of a linear program as \begin{align*} \kappa_P(L) &= \sup\{\delta:\inorm{\Delta L}<\delta\implies L+\Delta L\text{ is primal feasible}\} \\ \kappa_D(L) &= \sup\{\delta:\inorm{\Delta L}<\delta\implies L+\Delta L\text{ is dual feasible}\} \\ \end{align*} \end{definition} Note that the condition numbers can be interpreted as the~$\ell_\infty$-distance of the linear program for the set of infeasible linear programs. \begin{theorem}\label{th:renegarbd} \[ -\frac{\inorm{\b}\inorm{\c}}{\kappa_D(L)} \le v(L) \le \frac{\inorm{\b}\inorm{\c}}{\kappa_P(L)} \] \end{theorem} We start out with a simple but useful lemma that gives bounds on the size of the optimal solutions. \begin{lemma} If~$\x$ and~$\y$ are the optimal dual and primal solutions respectively, then \begin{align*} \norm\x_1 &\le \frac{\max\left(\inorm\b,-v(L)\right)}{\kappa_P(L)} \\ \norm\y_1 &\le \frac{\max\left(\inorm\c,v(L)\right)}{\kappa_D(L)} \end{align*} \end{lemma} \begin{proof} The first inequality is equivalent to \[ \kappa_P(L) \le \frac{\max\left(\inorm\b,-v(L)\right)}{\norm\x_1} \] So we need to find a perturbation that makes the problem primal infeasible. Since this will be true if the dual is unbounded, we look for a perturbation to~$A$ and~$\c$ that makes the dual unbounded. This will surely happen if \begin{align*} (\c+\dc)^T\x &> 0 \\ (A+\dA)^T\x &< 0 \end{align*} because then all sufficiently large positive multiples of~$\x$ will be feasible. Since~$\x$ is the solution to the dual problem, we know that \[ \c^T\x = v(L) \] Hence if~$v(L)>0$, we can take~$\dc=0$. Otherwise, choose \[ \dc = (-v(L)+\epsilon)\frac{\sgn\x}{\norm\x_1} \] where~$\sgn\x=(\sgn x_1,\dots,\sgn x_d)^T$ is the~$\pm1$~vector formed by the signs of the components of~$\x$, and~$\epsilon$ is any positive number. Note that \[ (\sgn\x)^T\x = \norm\x_1 \] so that \[ (\c+\dc)^T\x = \epsilon > 0 \] and \[ \inorm\dc = \frac{-v(L)+\epsilon}{\norm\x_1} \] Similarly, for~$dA$, we want \[ (\a_i+\da_i)^T\x < 0 \] and we know that \[ \a_i^T\x = b_i \] Hence if~$b_i<0$, take~$\da_i=0$, otherwise choose \[ \da_i = -(b_i+\epsilon)\frac{\sgn\x}{\norm\x_1} \] so that \[ \inorm{\da_i} \le \frac{|b_i+\epsilon|}{\norm\x_1} \le \frac{\inorm\b+\epsilon}{\norm\x_1} \] Hence, for all~$\epsilon>0$, we have \[ \kappa_P(L) \le \frac{\max(\inorm\b,-v(L))+\epsilon}{\norm\x_1} \] and this implies the stated inequality. For the second inequality, the proof is almost exactly the same. Set \begin{align*} \db &= \begin{cases}-(v(L)+\epsilon)\sgn\y/\norm\y_1 &\text{if $v(L)>0$}\\ 0 &\text{otherwise}\end{cases} \\ \dA_{ij} &= -c_j\frac{\sgn\y}{\norm\y_1} \end{align*} Except for a quirk of notation, the expression for~$\dA_{ij}$ is exactly analogous to the one given before for~$\da_i$, which could have been expressed as \[ \dA_{ij} = -(b_i+\epsilon)\frac{\sgn\x}{\norm\x_1} \] We have \begin{align*} (\b+\db)^T\y &< 0 \\ \sum_{i=1}^n y_i(\a_i+\da_i) &= 0 \end{align*} so that the primal is unbounded, and hence the dual is infeasible. Since \[ \norm{\dA}_{\max}=\frac{\inorm\c}{\norm\y_1}\] therefore \[ \kappa_D(L) \le \frac{\max(\inorm\c,v(L))+\epsilon}{\norm\y_1} \] for every~$\epsilon>0$, and this proves the stated inequality. \end{proof} Using this lemma, it is very easy to prove theorem~\ref{th:renegarbd}. For if~$\x$ and~$\y$ are as in the lemma, then \begin{align*} v(L)>0 &\implies v(L)=\c^T\x \le \inorm\c\norm\x_1 \le \frac{\inorm\c\inorm\b}{\kappa_P(L)} \\ v(L)<0 &\implies -v(L)=-\b^T\y \le \inorm\b\norm\y_1 \le \frac{\inorm\b\inorm\c}{\kappa_D(L)} \end{align*} and this is the theorem. The next theorem gives a bound on the change in~$v(L)$ that can occur as a result of perturbation of the linear program. \begin{theorem}\label{th:renegarerr} If~$\inorm\dL\le\frac12\min(\kappa_P(L),\kappa_D(L))$, then \[ \frac{|v(L+\dL)-v(L)|}{\inorm\dL} \le 4\frac{\inorm{L}\max(\inorm L,|v(L)|)}{\kappa_P(L)\kappa_D(L)} \] \end{theorem} We begin with a preliminary result that is simpler and illustrates the method of proof. \begin{lemma} If~$\dL=(0,\db,0)$ and~$\inorm\dL\le\frac12\kappa_D(L)$, then \[ \frac{v(L+\dL)-v(L)}{\inorm\dL} \le \frac{\max(\inorm\c,v(L))}{\kappa_D(L)} \] \end{lemma} \begin{proof} Assume WLOG~$v(L+\dL)\ge v(L)$, and let~$\y$ be an optimal primal solution of~$L$. Since neither~$A$ nor~$\c$ changes, it remains feasible for~$L+\dL$. Hence we have \begin{align*} v(L+\dL)-v(L) &\le (\b+\db)^T\y - \b^T\y \\ &= \db^T\y \\ &\le \inorm\db\norm\y_1 \\ &\le \inorm\db\frac{\max(\inorm\c,v(L))}{\kappa_D(L)} \end{align*} \end{proof} There is obviously an analogous result if only~$\c$ changes. Now let's see what we can do if~$A$ is perturbed. \begin{lemma} If~$\dL=(\dA,0,0)$, and~$\inorm\dL\le\frac12\min(\kappa_P(L),\kappa_D(L))$, then \[ \frac{|v(L+\dL)-v(L)|}{\inorm\dL} \le 2\frac{\max(\inorm\b,-v(L))\max(\inorm\c,v(L))} {\kappa_P(L)\kappa_D(L)} \] \end{lemma} \begin{proof} Suppose~$v(L+\dL)\ge v(L)$ and let~$\x$ be a dual solution for~$L+\dL$. Then we have that \[ (A+\dA)^T\x \le \b \implies A^T\x \le \b-\dA^T\x \] Hence~$\x$ is feasible for the dual of the perturbed system~$(A,\b-\dA^T\x,\c)$. By the previous lemma, we have \begin{align*} v(L+\dL)-v(L) &\le v(A,\b-\dA^T\x,\c)-v(L) \\ &\le \inorm{\dA^T\x}\frac{\max(\inorm\c,v(L))}{\kappa_D(L)} \\ &\le \norm\dA_{\max}\norm\x_1\frac{\max(\inorm\c,v(L))}{\kappa_D(L)} \\ &\le \norm\dA_{\max}\frac{\max(\inorm\b,-v(L+\dL))}{\kappa_P(L+\dL)} \frac{\max(\inorm\c,v(L))}{\kappa_D(L)} \\ &\le 2\inorm\dL\frac{\max(\inorm\b,-v(L))\max(\inorm\c,v(L))} {\kappa_P(L)\kappa_D(L)} \end{align*} On the other hand, if~$v(L+\dL)