
Lagrangian Duality

The Lagrangian

We consider an optimization problem in the standard form:

$$
\begin{array}{ll}
\operatorname{minimize} & f_0(x) \\
\text{subject to} & f_i(x) \leq 0, \quad i=1,\ldots,m \\
& h_i(x) = 0, \quad i=1,\ldots,p
\end{array}
$$

with variable $x \in \mathbf{R}^n$. We assume its domain $\mathcal{D}=\bigcap_{i=0}^m \operatorname{dom} f_i \cap \bigcap_{i=1}^p \operatorname{dom} h_i$ is nonempty, and denote the optimal value by $p^{\star}$. We do not assume the problem is convex.

The basic idea in Lagrangian duality is to take the constraints into account by augmenting the objective function with a weighted sum of the constraint functions. We define the Lagrangian $L: \mathbf{R}^n \times \mathbf{R}^m \times \mathbf{R}^p \rightarrow \mathbf{R}$ as

$$
L(x, \lambda, \nu) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p \nu_i h_i(x)
$$

with $\operatorname{dom} L = \mathcal{D} \times \mathbf{R}^m \times \mathbf{R}^p$. We refer to $\lambda_i$ as the Lagrange multiplier associated with the $i$th inequality constraint $f_i(x) \leq 0$; similarly, we refer to $\nu_i$ as the Lagrange multiplier associated with the $i$th equality constraint $h_i(x) = 0$. The vectors $\lambda$ and $\nu$ are called the dual variables or Lagrange multiplier vectors.

The Lagrange dual function

We define the Lagrange dual function (or just dual function) $g: \mathbf{R}^m \times \mathbf{R}^p \rightarrow \mathbf{R}$ as the minimum value of the Lagrangian over $x$: for $\lambda \in \mathbf{R}^m$, $\nu \in \mathbf{R}^p$,

$$
g(\lambda, \nu) = \inf_{x \in \mathcal{D}} L(x, \lambda, \nu) = \inf_{x \in \mathcal{D}} \left( f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p \nu_i h_i(x) \right).
$$

When the Lagrangian is unbounded below in $x$, the dual function takes on the value $-\infty$. Since the dual function is the pointwise infimum of a family of affine functions of $(\lambda, \nu)$, it is concave, even when the problem is not convex.
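These two properties can be checked numerically on a toy problem of our own (not from the text): minimize $x^2$ subject to $1 - x \leq 0$, with optimum $p^{\star} = 1$ at $x^{\star} = 1$. Minimizing the Lagrangian $L(x, \lambda) = x^2 + \lambda(1 - x)$ over $x$ gives $x = \lambda/2$, so $g(\lambda) = \lambda - \lambda^2/4$. The sketch below evaluates $g$ on a grid and verifies both concavity and the lower-bound property:

```python
import numpy as np

# Toy problem (our own example): minimize x^2 subject to 1 - x <= 0.
# Primal optimum: x* = 1, p* = 1.
# Lagrangian: L(x, lam) = x^2 + lam * (1 - x), minimized at x = lam / 2.

def g(lam):
    x = lam / 2.0                     # minimizer of L(., lam) over x
    return x**2 + lam * (1.0 - x)     # = lam - lam^2 / 4

lams = np.linspace(0.0, 4.0, 401)
vals = g(lams)

p_star = 1.0
# Every dual value is a lower bound on p*, even before maximizing.
assert np.all(vals <= p_star + 1e-12)

# Concavity: g at each midpoint dominates the average of the endpoints.
mid = g((lams[:-1] + lams[1:]) / 2.0)
assert np.all(mid >= (vals[:-1] + vals[1:]) / 2.0 - 1e-12)

print(round(vals.max(), 6))   # best bound on the grid, attained at lam = 2
```

Here the best grid value is $1.0$, matching $p^{\star}$ exactly; in general the dual function only guarantees a lower bound.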

Lower bounds on optimal value

The dual function yields lower bounds on the optimal value $p^{\star}$ of the problem: for any $\lambda \succeq 0$ and any $\nu$ we have

$$
g(\lambda, \nu) \leq p^{\star}.
$$

This important property is easily verified. Suppose $\tilde{x}$ is a feasible point, i.e., $f_i(\tilde{x}) \leq 0$ and $h_i(\tilde{x}) = 0$, and $\lambda \succeq 0$. Then we have

$$
\sum_{i=1}^m \lambda_i f_i(\tilde{x}) + \sum_{i=1}^p \nu_i h_i(\tilde{x}) \leq 0,
$$

since each term in the first sum is nonpositive and each term in the second sum is zero. Therefore

$$
L(\tilde{x}, \lambda, \nu) = f_0(\tilde{x}) + \sum_{i=1}^m \lambda_i f_i(\tilde{x}) + \sum_{i=1}^p \nu_i h_i(\tilde{x}) \leq f_0(\tilde{x}).
$$

Hence

$$
g(\lambda, \nu) = \inf_{x \in \mathcal{D}} L(x, \lambda, \nu) \leq L(\tilde{x}, \lambda, \nu) \leq f_0(\tilde{x}).
$$

Since $g(\lambda, \nu) \leq f_0(\tilde{x})$ holds for every feasible point $\tilde{x}$, taking the infimum over feasible $\tilde{x}$ gives $g(\lambda, \nu) \leq p^{\star}$.

The inequality holds, but is vacuous, when $g(\lambda, \nu) = -\infty$. The dual function gives a nontrivial lower bound on $p^{\star}$ only when $\lambda \succeq 0$ and $(\lambda, \nu) \in \operatorname{dom} g$, i.e., $g(\lambda, \nu) > -\infty$. We refer to a pair $(\lambda, \nu)$ with $\lambda \succeq 0$ and $(\lambda, \nu) \in \operatorname{dom} g$ as dual feasible, for reasons that will become clear later.

The Lagrange Dual Problem

For each pair $(\lambda, \nu)$ with $\lambda \succeq 0$, the Lagrange dual function gives us a lower bound on the optimal value $p^{\star}$ of the optimization problem. Thus we have a lower bound that depends on the parameters $\lambda, \nu$. A natural question is: what is the best lower bound that can be obtained from the Lagrange dual function? This leads to the optimization problem

$$
\begin{array}{ll}
\text{maximize} & g(\lambda, \nu) \\
\text{subject to} & \lambda \succeq 0.
\end{array}
$$

This problem is called the Lagrange dual problem. In this context the original problem is sometimes called the primal problem. The term dual feasible, describing a pair $(\lambda, \nu)$ with $\lambda \succeq 0$ and $g(\lambda, \nu) > -\infty$, now makes sense: it means, as the name implies, that $(\lambda, \nu)$ is feasible for the dual problem. We refer to $(\lambda^{\star}, \nu^{\star})$ as dual optimal or optimal Lagrange multipliers if they are optimal for the dual problem.

The Lagrange dual problem is a convex optimization problem, since the objective to be maximized is concave and the constraint is convex. This is the case whether or not the primal problem is convex.
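Continuing the toy problem from above (our own example, not the text's): for "minimize $x^2$ subject to $1 - x \leq 0$" the dual function is $g(\lambda) = \lambda - \lambda^2/4$, and the dual problem maximizes this concave function over $\lambda \geq 0$. The sketch below solves it as a convex problem by minimizing $-g$:

```python
from scipy.optimize import minimize_scalar

# Dual of the toy problem "minimize x^2 s.t. 1 - x <= 0" (our example):
#   maximize g(lam) = lam - lam^2 / 4  over  lam >= 0.
# Maximizing a concave function is convex; minimize -g on a bounded interval.
res = minimize_scalar(lambda lam: -(lam - lam**2 / 4.0),
                      bounds=(0.0, 10.0), method="bounded")

lam_star = res.x          # dual optimal multiplier, ~2.0
d_star = -res.fun         # best lower bound, ~1.0
p_star = 1.0              # primal optimum, attained at x = 1
assert d_star <= p_star + 1e-8   # weak duality
print(lam_star, d_star)
```

For this problem the best lower bound equals $p^{\star}$; in general the dual optimal value can be strictly smaller.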

Example (Lagrange dual of inequality form LP)

In a similar way we can find the Lagrange dual problem of a linear program in inequality form

$$
\begin{array}{ll}
\operatorname{minimize} & c^T x \\
\text{subject to} & Ax \preceq b.
\end{array}
$$

The Lagrangian is

$$
L(x, \lambda) = c^T x + \lambda^T (Ax - b) = -b^T \lambda + (A^T \lambda + c)^T x,
$$

so the dual function is

$$
g(\lambda) = \inf_x L(x, \lambda) = -b^T \lambda + \inf_x (A^T \lambda + c)^T x.
$$

The infimum of a linear function is $-\infty$, except in the special case when it is identically zero, so the dual function is

$$
g(\lambda) = \begin{cases} -b^T \lambda & A^T \lambda + c = 0 \\ -\infty & \text{otherwise.} \end{cases}
$$

The dual variable $\lambda$ is dual feasible if $\lambda \succeq 0$ and $A^T \lambda + c = 0$. The Lagrange dual of the LP is to maximize $g$ over all $\lambda \succeq 0$. Again we can reformulate this by explicitly including the dual feasibility conditions as constraints, as in

$$
\begin{array}{ll}
\operatorname{maximize} & -b^T \lambda \\
\text{subject to} & A^T \lambda + c = 0 \\
& \lambda \succeq 0,
\end{array}
$$

which is an LP in standard form. Note the interesting symmetry between the standard and inequality form LPs and their duals: The dual of a standard form LP is an LP with only inequality constraints, and vice versa.
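The primal-dual pair can be checked numerically. The sketch below uses hypothetical data of our own choosing (a small LP over the unit box) and solves both problems with `scipy.optimize.linprog`; since a feasible LP with finite optimal value has zero duality gap, the two optimal values coincide:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data (our own example): an inequality-form LP over the unit box.
#   minimize c^T x  subject to  A x <= b
c = np.array([-1.0, -1.0])
A = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0,  0.0],
              [ 0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])

# Primal: x is free; all constraints are carried by A x <= b.
primal = linprog(c, A_ub=A, b_ub=b, bounds=[(None, None)] * 2)

# Dual: maximize -b^T lam  s.t.  A^T lam + c = 0,  lam >= 0.
# linprog minimizes, so minimize b^T lam and negate the optimal value.
dual = linprog(b, A_eq=A.T, b_eq=-c, bounds=[(0.0, None)] * 4)

p_star = primal.fun          # -2.0, attained at x = (1, 1)
d_star = -dual.fun           # dual optimal value
assert abs(p_star - d_star) < 1e-8   # zero duality gap for this LP
print(p_star, d_star)
```

Note that `linprog` defaults to nonnegative variables, so the primal explicitly passes free bounds while the dual keeps $\lambda \succeq 0$.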

Reference

Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004.