
Monday, September 28, 2015

Weak and strong alternatives

Weak alternatives
  • Two systems of inequalities are called weak alternatives if at most one of the two is feasible.
  • This is true whether or not system 1 is convex (i.e. $f_i$ convex, $h_i$ affine)
  • System 2 is always convex ($g$ is concave and the constraint $\lambda \succeq 0$ is convex)
Lagrange duality theory can be applied to the problem of determining feasibility of a system of inequalities and equalities
$$f_i(x) \le 0, \quad i=1,\dots,m, \qquad h_i(x) = 0, \quad i=1,\dots,p$$
The above can be posed as the standard problem with objective $f_0 = 0$:
$$\text{minimize } 0 \quad \text{subject to } f_i(x) \le 0,\ i=1,\dots,m; \quad h_i(x) = 0,\ i=1,\dots,p$$
This problem has optimal value
$$p^\star = \begin{cases} 0 & \text{if the system is feasible} \\ \infty & \text{if the system is infeasible} \end{cases}$$
We associate with the inequality system the dual function
$$g(\lambda,\nu) = \inf_{x \in \mathcal{D}} \left( \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p \nu_i h_i(x) \right)$$
Note that the dual function is positive homogeneous, i.e. for $\alpha > 0$, $g(\alpha\lambda, \alpha\nu) = \alpha g(\lambda,\nu)$.  The dual problem of maximizing $g(\lambda,\nu)$ subject to $\lambda \succeq 0$ therefore has the optimal value
$$d^\star = \begin{cases} \infty & \text{if } \lambda \succeq 0,\ g(\lambda,\nu) > 0 \text{ is feasible} \\ 0 & \text{if } \lambda \succeq 0,\ g(\lambda,\nu) > 0 \text{ is infeasible} \end{cases}$$
From weak duality $d^\star \le p^\star$ and the above facts we can conclude: if the inequality system $\lambda \succeq 0,\ g(\lambda,\nu) > 0$ is feasible, then $d^\star = \infty$, and hence the original system is infeasible (since $p^\star = \infty$).

A point $(\lambda, \nu)$ with $\lambda \succeq 0$ and $g(\lambda,\nu) > 0$ is a certificate of infeasibility of the original system.  To summarize:
  • feasibility of system 2 implies infeasibility of system 1
  • feasibility of system 1 implies infeasibility of system 2
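As a small numeric sketch (my own toy example, not from the notes): when the $f_i$ are affine, $f_i(x) = a_i^T x - b_i$, the dual function is $g(\lambda) = -b^T\lambda$ whenever $A^T\lambda = 0$ (and $-\infty$ otherwise), so any $\lambda \succeq 0$ with $A^T\lambda = 0$ and $-b^T\lambda > 0$ certifies infeasibility of system 1.

```python
import numpy as np

# Infeasible linear system (system 1): x <= -1 and -x <= -1, i.e. Ax - b <= 0
A = np.array([[1.0], [-1.0]])
b = np.array([-1.0, -1.0])

# Candidate certificate for system 2: lambda >= 0 with A^T lambda = 0
lam = np.array([1.0, 1.0])
assert np.all(lam >= 0) and np.allclose(A.T @ lam, 0)

# For affine f_i, g(lambda) = -b^T lambda when A^T lambda = 0 (else -inf)
g_val = -b @ lam   # here g(lambda) = 2 > 0: certificate of infeasibility
```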

Strong alternatives

When the original inequality system is convex, and some type of constraint qualification holds, the pair of weak alternatives becomes a pair of strong alternatives, which means exactly one of the two alternatives holds.

If system 1 is convex, it can be expressed as
$$f_i(x) \le 0, \quad i=1,\dots,m, \qquad Ax = b, \quad A \in \mathbf{R}^{p \times n}$$
This system and its alternative
$$\lambda \succeq 0, \quad g(\lambda,\nu) > 0$$
are strong alternatives provided there exists an $x \in \operatorname{relint} \mathcal{D}$ with $Ax = b$ and the optimal value $p^\star$ is attained.  [More on that later...]

See Farkas' lemma

Friday, September 25, 2015

Karush-Kuhn-Tucker (KKT) conditions

If strong duality holds (zero duality gap), $x^\star$ and $(\lambda^\star, \nu^\star)$ (possibly more than one such pair) are optimal (see previous post), and the problem has differentiable $f_i, h_i$, then they must satisfy the KKT conditions [a necessary condition]:
  1. primal constraints: $f_i(x^\star) \le 0,\ i=1,\dots,m$ and $h_i(x^\star) = 0,\ i=1,\dots,p$
  2. dual constraints: $\lambda^\star \succeq 0$
  3. complementary slackness: $\lambda_i^\star f_i(x^\star) = 0,\ i=1,\dots,m$
  4. gradient of the Lagrangian with respect to $x$ vanishes:
    $$\nabla f_0(x^\star) + \sum_{i=1}^m \lambda_i^\star \nabla f_i(x^\star) + \sum_{i=1}^p \nu_i^\star \nabla h_i(x^\star) = 0$$
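As a sanity check, the four conditions can be verified by hand on a toy problem (my own example, not from the notes): minimize $x^2$ subject to $1 - x \le 0$, whose optimum is $x^\star = 1$ with multiplier $\lambda^\star = 2$.

```python
# Toy problem: minimize x^2 subject to f1(x) = 1 - x <= 0
x_star, lam_star = 1.0, 2.0

f1 = 1.0 - x_star        # constraint value at x*
grad_f0 = 2.0 * x_star   # d/dx x^2
grad_f1 = -1.0           # d/dx (1 - x)

assert f1 <= 0                             # 1. primal feasibility
assert lam_star >= 0                       # 2. dual feasibility
assert lam_star * f1 == 0                  # 3. complementary slackness
assert grad_f0 + lam_star * grad_f1 == 0   # 4. stationarity of the Lagrangian
```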
For convex problems

Key: When the primal problem is convex, the KKT conditions are also sufficient for points to be primal and dual optimal.

If $\tilde{x}, \tilde{\lambda}, \tilde{\nu}$ satisfy the KKT conditions for a convex problem (i.e. $f_i$ convex, $h_i$ affine), then they are optimal, with zero duality gap:
  • from complementary slackness: $f_0(\tilde{x}) = L(\tilde{x}, \tilde{\lambda}, \tilde{\nu})$
  • from the 4th condition (and convexity): $g(\tilde{\lambda}, \tilde{\nu}) = L(\tilde{x}, \tilde{\lambda}, \tilde{\nu})$
hence $f_0(\tilde{x}) = g(\tilde{\lambda}, \tilde{\nu})$.


If Slater's condition is satisfied:

$x$ is optimal if and only if there exist $(\lambda, \nu)$ that, together with $x$, satisfy the KKT conditions
  • Slater's condition implies strong duality, therefore the dual optimum is attained
  • this generalizes the optimality condition $\nabla f_0(x) = 0$ for unconstrained problems
Key: Many algorithms for convex optimization are conceived as methods for solving the KKT conditions.

Slater's condition

If the primal problem is convex, i.e. of the form
$$\text{minimize } f_0(x) \quad \text{subject to } f_i(x) \le 0,\ i=1,\dots,m; \quad Ax = b$$
with $f_0, \dots, f_m$ convex, and if there exists a strictly feasible point, or more precisely an $x \in \operatorname{relint} \mathcal{D}$ such that
$$f_i(x) < 0, \quad i=1,\dots,m, \qquad Ax = b,$$
then strong duality holds.
Moreover, the dual optimum is attained whenever $d^\star > -\infty$.

Thursday, September 24, 2015

Notes on duality

Given a standard form problem (not necessarily convex)
$$\text{minimize } f_0(x) \quad \text{subject to } f_i(x) \le 0,\ i=1,\dots,m; \quad h_i(x) = 0,\ i=1,\dots,p$$
with variable $x \in \mathbf{R}^n$, domain $\mathcal{D}$, and optimal value $p^\star$,
with Lagrangian:
$$L(x,\lambda,\nu) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p \nu_i h_i(x)$$
and the (Lagrange) dual function:
$$g(\lambda,\nu) = \inf_{x \in \mathcal{D}} L(x,\lambda,\nu)$$
Properties:

  • $g$ is concave in $(\lambda,\nu)$, since it is the infimum of a family of affine functions.
  • Lower bound property: if $\lambda \succeq 0$ then $g(\lambda,\nu) \le p^\star$,

since, if $\tilde{x}$ is feasible and $\lambda \succeq 0$, then
$$g(\lambda,\nu) = \inf_{x \in \mathcal{D}} L(x,\lambda,\nu) \le L(\tilde{x},\lambda,\nu) \le f_0(\tilde{x});$$
minimizing over all feasible $\tilde{x}$ gives $g(\lambda,\nu) \le p^\star$.
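The lower bound can be seen numerically on a toy problem (an assumed example of mine): for minimize $x^2$ subject to $x \ge 1$ we have $p^\star = 1$, and minimizing the Lagrangian $x^2 + \lambda(1-x)$ over $x$ in closed form gives $g(\lambda) = \lambda - \lambda^2/4$.

```python
# Toy problem: min x^2 s.t. x >= 1, so p* = 1.
# L(x, lam) = x^2 + lam*(1 - x) is minimized at x = lam/2,
# giving the dual function g(lam) = lam - lam^2/4.
p_star = 1.0

def g(lam):
    return lam - lam**2 / 4.0

lams = [0.0, 0.5, 1.0, 2.0, 5.0, 10.0]
assert all(g(l) <= p_star for l in lams)   # weak duality: g(lam) <= p* for lam >= 0
assert g(2.0) == p_star                    # here the bound is tight at lam = 2
```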

Lagrange dual and conjugate function

Given an optimization problem with linear inequality and equality constraints
$$\text{minimize } f_0(x) \quad \text{subject to } Ax \preceq b, \quad Cx = d.$$
Using the definition of the conjugate, $f^*(y) = \sup_{x \in \operatorname{dom} f} (y^T x - f(x))$, we can write the dual function as:
$$\begin{aligned} g(\lambda,\nu) &= \inf_x \left( f_0(x) + \lambda^T (Ax - b) + \nu^T (Cx - d) \right) \\ &= -b^T\lambda - d^T\nu + \inf_x \left( f_0(x) + (A^T\lambda + C^T\nu)^T x \right) \\ &= -b^T\lambda - d^T\nu - f_0^*(-A^T\lambda - C^T\nu) \end{aligned}$$
with domain
$$\operatorname{dom} g = \{ (\lambda,\nu) \mid -A^T\lambda - C^T\nu \in \operatorname{dom} f_0^* \}$$

  • simplifies derivation of the dual if the conjugate of $f_0$ is known
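A quick numeric sketch (hypothetical random data, no equality constraints): for $f_0(x) = x^T x$, whose conjugate is $f_0^*(y) = \|y\|^2/4$, the conjugate formula for $g$ can be checked against minimizing the Lagrangian directly.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))
b = rng.standard_normal(3)
lam = np.abs(rng.standard_normal(3))   # any lam >= 0

# f0(x) = x^T x has conjugate f0*(y) = ||y||^2 / 4, so
# g(lam) = -b^T lam - f0*(-A^T lam) = -b^T lam - ||A^T lam||^2 / 4
g_conj = -b @ lam - (A.T @ lam) @ (A.T @ lam) / 4.0

# Direct route: x^T x + lam^T (A x - b) is minimized at x = -A^T lam / 2
x_min = -(A.T @ lam) / 2.0
g_direct = x_min @ x_min + lam @ (A @ x_min - b)

assert np.isclose(g_conj, g_direct)
```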

The dual problem (finding the greatest lower bound for the primal problem)
$$\text{maximize } g(\lambda,\nu) \quad \text{subject to } \lambda \succeq 0$$
  • a convex optimization problem; optimal value denoted $d^\star$
  • $(\lambda, \nu)$ are dual feasible if $\lambda \succeq 0$ and $(\lambda,\nu) \in \operatorname{dom} g$

Weak duality: $d^\star \le p^\star$

  • always holds (for convex and nonconvex problems)
  • can be used to find lower bounds for difficult problems

Strong duality: $d^\star = p^\star$ (zero duality gap)

  • does not hold in general
  • (usually) holds for convex problems
  • constraint qualifications assert conditions that guarantee strong duality in convex problems (e.g. Slater's constraint qualification)

Notes on dual norm

Dual norm is defined as
$$\|z\|_* = \sup \{ z^T x \mid \|x\| \le 1 \}$$
The dual norm can be interpreted as the operator norm of $z^T$, viewed as a $1 \times n$ matrix, with the norm $\|\cdot\|$ on $\mathbf{R}^n$.

From the definition of the dual norm we obtain the inequality
$$z^T x \le \|x\| \, \|z\|_*$$
The conjugate of any norm $f(x) = \|x\|$ is the indicator function of the dual norm unit ball:
$$f^*(y) = \begin{cases} 0 & \|y\|_* \le 1 \\ \infty & \text{otherwise} \end{cases}$$
Proof. If $\|y\|_* > 1$, then by definition of the dual norm there is a $z \in \mathbf{R}^n$ with $\|z\| \le 1$ and $y^T z > 1$.  Taking $x = tz$ and letting $t \to \infty$, we have
$$y^T x - \|x\| = t(y^T z - \|z\|) \to \infty,$$
which shows that $f^*(y) = \infty$.  Conversely, if $\|y\|_* \le 1$, then $y^T x \le \|x\| \, \|y\|_*$ for all $x$, which implies $y^T x - \|x\| \le 0$ for all $x$.  Therefore $x = 0$ maximizes $y^T x - \|x\|$, with maximum value $0$.
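For instance (a standard fact, illustrated here with random data): the dual of the $\ell_1$ norm is the $\ell_\infty$ norm, and the inequality above can be spot-checked numerically.

```python
import numpy as np

# Dual of l1: sup{ z^T x : ||x||_1 <= 1 } = max_i |z_i| = ||z||_inf,
# attained at a signed standard basis vector.
rng = np.random.default_rng(1)
z = rng.standard_normal(5)

dual = max(abs(z))
assert np.isclose(dual, np.linalg.norm(z, np.inf))

# Check z^T x <= ||x||_1 * ||z||_inf on random points
for _ in range(100):
    x = rng.standard_normal(5)
    assert z @ x <= np.linalg.norm(x, 1) * dual + 1e-12
```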

Wednesday, September 23, 2015

Schur complement: characterizations of positive definiteness for block matrices

Let a block matrix $X \in \mathbf{S}^n$ be partitioned as

$$X = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix}$$
where $A \in \mathbf{S}^k$.  If $\det A \ne 0$, the Schur complement is
$$S = C - B^T A^{-1} B$$
The following characterizations of positive definiteness can be derived:
  • $X \succ 0$ iff $A \succ 0$ and $S \succ 0$
  • If $A \succ 0$, then $X \succeq 0$ iff $S \succeq 0$
Example:
$$A^T A \preceq t^2 I,\ t \ge 0 \iff tI - t^{-1} A^T A \succeq 0,\ t > 0 \iff \begin{bmatrix} tI & A \\ A^T & tI \end{bmatrix} \succeq 0$$
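A numeric spot-check of the second characterization (random matrices, constructed so that $A \succ 0$ and the Schur complement is positive semidefinite):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3))
A = B @ B.T + 3 * np.eye(3)   # A > 0 by construction
C = B.T @ B + 3 * np.eye(3)

X = np.block([[A, B], [B.T, C]])
S = C - B.T @ np.linalg.solve(A, B)   # Schur complement of A in X

def is_psd(M, tol=1e-9):
    # positive semidefinite iff the smallest eigenvalue is (numerically) >= 0
    return np.min(np.linalg.eigvalsh(M)) >= -tol

# With A > 0: X >= 0 iff S >= 0 (both hold for this instance)
assert is_psd(A) and is_psd(S) and is_psd(X)
```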

Tuesday, September 22, 2015

Notes on optimization problem

Optimality criterion for differentiable f0
Given a convex optimization problem with objective $f_0$, $x$ is optimal iff it is feasible and
$$\nabla f_0(x)^T (y - x) \ge 0 \quad \text{for all feasible } y$$

If $\nabla f_0(x) \ne 0$, this means that $-\nabla f_0(x)$ defines a supporting hyperplane to the feasible set at $x$.

Unconstrained problem: $x$ is optimal iff
$$x \in \operatorname{dom} f_0, \quad \nabla f_0(x) = 0$$
Equality constrained problem:
$$\text{minimize } f_0(x) \quad \text{subject to } Ax = b$$
$x$ is optimal iff there exists a $\nu$ such that
$$x \in \operatorname{dom} f_0, \quad Ax = b, \quad \nabla f_0(x) + A^T \nu = 0$$
Note: Lagrange multiplier optimality condition.

Minimization over nonnegative orthant
$$\text{minimize } f_0(x) \quad \text{subject to } x \succeq 0$$
$x$ is optimal iff
$$x \in \operatorname{dom} f_0, \quad x \succeq 0, \quad \begin{cases} \nabla f_0(x)_i \ge 0 & x_i = 0 \\ \nabla f_0(x)_i = 0 & x_i > 0 \end{cases}$$
Note: The last condition is a complementarity condition.
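The complementarity condition can be checked on a toy instance (my own example): projecting a point $c$ onto the nonnegative orthant, i.e. minimizing $\|x - c\|^2$ subject to $x \succeq 0$, whose solution is $x_i = \max(c_i, 0)$.

```python
import numpy as np

# Toy instance: minimize ||x - c||^2 over x >= 0
c = np.array([1.5, -2.0, 0.7, -0.1])
x = np.maximum(c, 0.0)    # projection onto the nonnegative orthant
grad = 2.0 * (x - c)      # gradient of f0 at x

# Complementarity-style optimality check
for gi, xi in zip(grad, x):
    if xi == 0:
        assert gi >= 0            # grad_i >= 0 where x_i = 0
    else:
        assert abs(gi) < 1e-12    # grad_i = 0 where x_i > 0
```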


Monday, September 21, 2015

Notes on conjugate function (Fenchel conjugate)

The conjugate function (Fenchel conjugate)
  • Counterpart to Fourier transform in Convex Analysis
The conjugate of $f$ ($f$ not necessarily convex):
$$f^*(y) = \sup_{x \in \operatorname{dom} f} (y^T x - f(x))$$

The domain of $f^*$ consists of the $y \in \mathbf{R}^n$ for which the supremum is finite, i.e. for which the difference $y^T x - f(x)$ is bounded above on $\operatorname{dom} f$.

$f^*$ is convex, since it is the pointwise supremum of a family of affine functions of $y$.

Examples:
  • Affine function. $f(x) = ax + b$.  Then $yx - f(x) = (y - a)x - b$, which is bounded above only when $y = a$, so

$$f^*(y) = \begin{cases} -b & y = a \\ \infty & \text{otherwise} \end{cases}$$
with $\operatorname{dom} f^* = \{a\}$.

  • Negative logarithm. $f(x) = -\log x$, with $\operatorname{dom} f = \mathbf{R}_{++}$.

The function $xy + \log x$ is unbounded above if $y \ge 0$ and reaches its maximum at $x = -1/y$ otherwise, so
$$f^*(y) = \begin{cases} -1 - \log(-y) & y < 0 \\ \infty & \text{otherwise} \end{cases}$$
with $\operatorname{dom} f^* = \{ y \mid y < 0 \}$.
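The closed form can be spot-checked numerically (a quick sketch at a single point $y = -2$, with a coarse grid search standing in for the supremum):

```python
import math

# For f(x) = -log x, the conjugate is f*(y) = -1 - log(-y) for y < 0.
# The maximizer of x*y + log(x) over x > 0 is x = -1/y.
y = -2.0
x_opt = -1.0 / y
closed_form = -1.0 - math.log(-y)
assert math.isclose(x_opt * y + math.log(x_opt), closed_form)

# A grid search never exceeds the closed-form supremum
grid_best = max(x * y + math.log(x) for x in [k / 1000.0 for k in range(1, 5000)])
assert grid_best <= closed_form + 1e-9
```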

Fenchel's inequality

Fenchel's inequality can be obtained from the definition of the conjugate function:
$$f(x) + f^*(y) \ge x^T y \quad \text{for all } x, y$$
since, for each $x$ and $y$,
$$f^*(y) \ge x^T y - f(x) \implies f(x) + f^*(y) \ge x^T y$$
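For a concrete instance: with $f(x) = x^2$ (so $f^*(y) = y^2/4$), the inequality reduces to $(x - y/2)^2 \ge 0$, which a quick scan confirms:

```python
# Fenchel's inequality for f(x) = x^2, whose conjugate is f*(y) = y^2/4:
# x*y <= x^2 + y^2/4, since x^2 - x*y + y^2/4 = (x - y/2)^2 >= 0.
pairs = [(x, y) for x in (-3.0, -0.5, 0.0, 1.2, 7.0) for y in (-4.0, 0.0, 0.3, 2.0)]
ok = all(x * y <= x * x + y * y / 4.0 for x, y in pairs)
assert ok
```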

Notes on convex functions

Basic Properties

[TODO]

Examples
  • Affine functions are both convex and concave
  • All norms are convex (e.g. the Frobenius norm, the spectral norm)
  • Max function $f(x) = \max_i x_i$ is convex on $\mathbf{R}^n$
  • Quadratic-over-linear function $f(x,y) = x^2/y$ is convex on $\operatorname{dom} f = \mathbf{R} \times \mathbf{R}_{++}$
  • Log-sum-exp $f(x) = \log \sum_{i=1}^n e^{x_i}$ is convex on $\mathbf{R}^n$
  • Geometric mean $f(x) = \left( \prod_{i=1}^n x_i \right)^{1/n}$ is concave on $\operatorname{dom} f = \mathbf{R}^n_{++}$
  • Log-det is concave on $\operatorname{dom} f = \mathbf{S}^n_{++}$
Restriction of a convex function to a line

$f : \mathbf{R}^n \to \mathbf{R}$ is convex iff the function $g : \mathbf{R} \to \mathbf{R}$,
$$g(t) = f(x + tv), \quad \operatorname{dom} g = \{ t \mid x + tv \in \operatorname{dom} f \},$$
is convex in $t$ for any $x \in \operatorname{dom} f$, $v \in \mathbf{R}^n$.

Upshot: can check convexity of f by checking convexity of functions of one variable.
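A numeric illustration of the one-variable check (my own sketch): test midpoint convexity of log-sum-exp along randomly chosen lines $x + tv$.

```python
import math
import random

def lse(x):
    # log-sum-exp: f(x) = log(sum_i exp(x_i)), convex on R^n
    return math.log(sum(math.exp(xi) for xi in x))

random.seed(0)
violations = 0
for _ in range(50):
    x = [random.uniform(-2, 2) for _ in range(4)]
    v = [random.uniform(-1, 1) for _ in range(4)]
    g = lambda t: lse([xi + t * vi for xi, vi in zip(x, v)])
    t1, t2 = random.uniform(-3, 3), random.uniform(-3, 3)
    # midpoint convexity of the restriction g(t) = f(x + t v)
    if g((t1 + t2) / 2) > (g(t1) + g(t2)) / 2 + 1e-12:
        violations += 1

assert violations == 0
```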

Epigraph and sublevel set 

The $\alpha$-sublevel set of $f : \mathbf{R}^n \to \mathbf{R}$ is defined as the set of all $x$ for which $f(x)$ is at most $\alpha$:
$$C_\alpha = \{ x \in \operatorname{dom} f \mid f(x) \le \alpha \}$$

Property: sublevel sets of convex functions are convex (converse is false)

The epigraph of $f : \mathbf{R}^n \to \mathbf{R}$:
$$\operatorname{epi} f = \{ (x,t) \in \mathbf{R}^{n+1} \mid x \in \operatorname{dom} f,\ f(x) \le t \}$$

Property: f is convex iff  epi f is a convex set

Operations that preserve convexity

methods for establishing convexity of a function
  1. verify definition (often simplified by restricting to a line)
  2. for twice differentiable functions, show $\nabla^2 f(x) \succeq 0$
  3. show that f is obtained from simple convex functions by operations that preserve convexity
    • nonnegative weighted sum
    • composition with affine function
    • pointwise maximum and supremum
    • composition
    • minimization
    • perspective

Notes on convex sets

Cones

Definition of a cone is quite general:

A set $C$ is a cone if for every $x \in C$ and $\theta \ge 0$, we have $\theta x \in C$.  (Note that a subspace is a cone, since every nonnegative scalar multiple of a point in the subspace belongs to the subspace.)

A set C is a convex cone if it is convex and a cone.

A proper cone on the other hand is more strict.

A cone $K \subseteq \mathbf{R}^n$ is a proper cone if it satisfies the following

  • K is convex
  • K is closed
  • K is solid, which means it has nonempty interior.
  • K is pointed, which means that it contains no line.

Dual cones and generalized inequalities

The dual cone of a cone $K$:
$$K^* = \{ y \mid y^T x \ge 0 \text{ for all } x \in K \}$$
Dual cones of proper cones are proper, so a dual cone defines generalized inequalities in the dual domain:
$$y \succeq_{K^*} 0 \iff y^T x \ge 0 \text{ for all } x \succeq_K 0$$
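For example, the nonnegative orthant $\mathbf{R}^n_+$ is self-dual: $y^T x \ge 0$ for all $x \succeq 0$ exactly when $y \succeq 0$. A quick numeric check (random data):

```python
import numpy as np

rng = np.random.default_rng(3)
y = np.abs(rng.standard_normal(4))         # y in the orthant
for _ in range(100):
    x = np.abs(rng.standard_normal(4))     # x in K = R^4_+
    assert y @ x >= 0                      # so y is in K*

# A vector with a negative entry is not in K*: test it against
# the coordinate direction where it is negative
y_bad = np.array([1.0, -0.5, 2.0, 0.0])
x = np.array([0.0, 1.0, 0.0, 0.0])
assert y_bad @ x < 0
```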

Characterization of minimum and minimal elements via dual inequalities

minimum element w.r.t. K

Definition: $x$ is the minimum element of $S$ iff for all $\lambda \succ_{K^*} 0$, $x$ is the unique minimizer of $\lambda^T z$ over $S$

minimal element w.r.t. K

Definitions:

  • if $x$ minimizes $\lambda^T z$ over $S$ for some $\lambda \succ_{K^*} 0$, then $x$ is minimal
  • if $x$ is a minimal element of a convex set $S$, then there exists a nonzero $\lambda \succeq_{K^*} 0$ such that $x$ minimizes $\lambda^T z$ over $S$