Convex Program#

\[\newcommand\realnumber{\mathbb{R}} \newcommand\vbx{\vb{x}}\]

Convex optimization#

convex program: objective and constraint functions are convex

\(f_i(x)\) is convex if its domain is convex set and \(f_i(\alpha x + \beta y) \le \alpha f_i(x) + \beta f_i(y)\) if \(\alpha + \beta = 1\), \(\alpha\ge 0\), \(\beta \ge 0\).

Examples: linear, quadratic functions

Special case: linear programming problems, least-square problems

standard form convex opt problem: \(f_0, f_1, \dots, f_m\) are convex: equality constraints are affine

important property: feasible set of a convex opt prob is cvx

generalized inequalities are also ok

\(\min c^T x\) subject to: \(A_0+A_1x_1 + A_2x_2 +\dots + A_n x_n\preceq 0\)

any locally opt point of a cvx program is (globally) optimal (proof by contradiction)

Optimality criteria for differentiable convex \(f_0\)#

\(x\) is optimal if and only if it is feasible and \(\nabla f_0(x)^T (y-x) \ge 0 \text{ for all feasible } y\)

Proof. directional derivative $\( \frac{d}{dt} f(x+tv) \mid_{t=0} = \nabla f(x)^T v \)$

uncontrained problem: \(x\) is optimal iff \(x\in dom f_0\), \(\nabla f_0(x)=0\)#

equality constrained problem#

\(\min f_0(x)\) subject to \(Ax = b\) \(x\) is optimal iff there exists a \(\nu\) such that \(x\in dom f_0, Ax=b, \nabla f_o(x) + A^T \nu = 0\)

Proof. By Null space and range Definition 24

minimization over nonnegative orthant#

minimize \(f_0(x)\) subject to \(x \preceq 0\), \(x\) is optimal iff \(x\in dom f_0\), \(x\preceq 0\), \(\nabla f_0(x)_i\ge 0,x_i=0\) and \(\nabla f_0(x)_i=0,x_i>0\).

Equivalent convex programs#

two problems are (informally) equivalent if the solution of one is readily obtained from the solution of the other, and vice-versa

eliminating equality constraints#

  • minimize \(f_0(x)\)

  • subject to \(f_i(x)\le 0\), \(Ax=b\) find

  • \(x_0\): a particular solution of \(Ax = b\)

  • matrix \(F\): whose range is the null space of \(A\), i.e., \(R(F)=N(A)\) is equivalent to

  • minimize (over z) \(f_0(Fz+ x_0)\)

  • subject to \(f_i(Fz+ x_0) \le 0\), \(i=1,\dots, m\)

introducing equality constraints#

introducing slack variables for linear inequalities#

  • minimize \(f_0(x)\)

  • subject to \(a_i^T x \le b_i\) is equivalent to

epigraph form:#

standard form cvx problem is equivalent to

minimizing over some variables#

small perturbation of the problem makes it very hard (potentially)

  • max convex or min concave

  • non-convex

Quasiconvex optimization#

\(f_0: \R^n \to \R\) quasiconvex, \(f_i\) convex, (local) optimal, … (global) optimal

such a representation always exists and not unique

Example 13

  • \(\phi_t(x) = 0, f_0(x) \le t; \infty , otherwise\)

  • \(\phi_t(x) = dist(x, \{z\mid f_0(z) \le t\})\)

  • convex over concave functions: \(f_0(x)=\frac{p(x)}{q(x)}\) with \(p\) convex, \(q\) concave, and \(q(x)>0\) on \(dom f_0\) can take \(\phi_t(x) = p(x) - tq(x)\):

    • for \(t\ge 0\), \(\phi_t\) convex in \(x\)

    • \(p(x)/q(x)\le t\) iff \(\phi_t(x)\le 0\).

quasiconvex opt via convex feasibility problems#

let \(p^*\) denote the optimal value of the quasiconvex program

  • minimize \(f_0(x)\)

  • s.t. \(f_i(x)\le 0\), \(Ax=b\)

solve the feasibility problem:

  • find \(x\)

  • s.t. \(\phi_t(x) \le 0\), \(f_i(x)\le 0\), \(Ax=b\)

bisection method for quasiconvex opt#

requires exactly \(O(\log_2 ((u-l)/\epsilon)\) iterations

Linear program (LP)#

general form

  • \(\min c^T x \)

  • \(\min c^T x + d\)

  • s.t. \(Gx\preceq h\), \(Ax = b\)

  • convex problem with affine objective and constraint functions

  • feasible set is a polyhedron (affine inequality defines half hyperplane)

standard form

  • \(\min c^T x \)

  • s.t. \(Ax=b\), \(x\succeq 0\)

canonical form

  • \(\min c^T x \)

  • s.t. \(Ax\succeq b\), \(x\succeq 0\)

convertible between (general, standard, canonical form)

Proof. introduct slack variable \(s\)

  • \(G(x^+ - x^-) + s = h\)

  • \(A(x^+-x^-) = b\)

  • \(x^+\succeq 0\), \(x^-\succeq 0\), \(s\succeq 0\) (\([x^+, x^-, s]\))

Example 14 (Piecewise-linear minimization)

  • minimize \(\max_{i=1,\dots, m} (a_i^T x+b_i)\) equivalent to an LP

  • minimize \(t\)

  • s.t. \(a_i^T x + b_i \le t\), \(i=1,\dots,m\)

Integer Linear program (IP, ILP)#

form

  • \(\min c^T x + d\)

  • s.t. \(Gx\preceq h\)

  • \(Ax = b\)

  • \(x\in Z_n\)

Example 16 (Shortest path problem)

find a path from \(s\) to \(t\) in directed graph \(G=(V,E)\) withthe smallest total cost

  • \(\min \sum_{(i,j)\in E} c_{ij} x_{ij}\)

  • subject to: \(\sum_{j:(i,j)\in E} x_{ij} - \sum_{j:(j,i)\in E} x_{ji} = \)

Example 17 (Min vertex cover)

find the smallest subset of vertices in undirected graph \(G=(V,E)\) that contains at least one endpoint of every edge in the graph

  • \(\min \sum_{i\in V} x_i\)

  • s.t. \(x_i + x_j \ge 1\), \(\forall (x,j) \in E\)

  • x_i \in {0,1 }, \(\forall \in E\)

Example 18 (Max independent set)

Example 19 (Max weighted bipartite matching (perfect matching))

Integer programming is generally NP-Hard.

One natural idea for solving ILPs: “relax” the integrality constraint, i.e., allow \(x\) to take on real values

problme: the LP’s optimum can be different from the optimal of the ILP to take on real values

problem: the LP’s optimum can be different from the optimal of the ILP. integrality gap: \(sup_i \frac{OPT(I)}{OPT_f(I)}\)

solution: try to round the solution of the LP to a feasible integral point and take that as the solution to the ILP

  • determining how to do rounding to achieve the optimum can be as hard as solving the original ILP itself

  • develop approximation algorithms that go from the optimal point of the LP to a point that is nearby the optimal point of the ILP, and bound how far off it is from the true optimal point

Example 20

solving the minimum vertex cover problem by relaxing the integer constraints to \(x_i\ge 0\), \(\forall i\in V\), and rounding the solution of the LP to obtain the (approximate) solution to the ILP

proof: let \(x_i^*\) be the optimal solution of the LP, we have \(x_i^* \in \{0, 1/2, 1 \}\). Then \(\sum_{i\in V} x_i \le \sum_{i\in V} 2x_i^* = 2\text{OPT}_f(I)\le 2\text{OPT}(I)\). approximation ratio: \(\alpha=2\)

Total unimodularity#

There are special cases that the relaxation is exact, i.e., the relaxed LP provides an integer optimal point.

An integer matrix is totally unimodular (TUM) if the determinant of every square submatrix of the amtrix is either -1, 0, or 1. e.g., incidence matrix of a directed graph; incidence matrix of an undirected bipartite graph

If A is TUM and b is an integer vector, all vertices of the polyhedron \(P=\{ x\mid Ax=b,x\succeq 0 \}\) (or \(P=\{ x\mid Ax\succeq b,x\succeq 0 \}\)) are integer

An LP in the standard or canonical form where constraint matrix A is TUM and b is an integer vector, has an integer optimal point

So an ILP whose LP relation satisfies the above conditions can be solved exactly by solving its LP relaxation! e.g. shortest path problem the incidence matrix correspond to the conservation constraint

Quadratic program (QP)#

Definition 25 (Quadratic program)

  • minimize (1/2)x^T P x + q^T x + r

  • s.t. Gx \preceq h, Ax = b

  • \(P\in S_+^n\), so objective is convex quadratic

  • minimize a convex quadratic function over a polyhedron

Examples#

Least square#

Example 21 (Least square)

\(\min || Ax - b||_2^2 = (Ax-b)^T(Ax-b) = ....\), set gradient to zero: \(2A^T A x - 2A^T b = 0\), the solution \(x^*=\)

Linear program with random cost#

  • \(\min c^T x + \gamma x^T \Sigma x = E[c^T x] + \gamma Var(c^T x\)

  • s.t. \(Gx \preceq h\), \(Ax=b\)

\(Var(c^T x)=x^T E[(c-\bar{c})(c-\bar{c})^T] x\) where covariance matrix \(Conv(c_i,c_j)=E[(c_i-\bar{c}_i)(c_j-\bar{c}_j)]\). \(\gamma>0\) is a risk aversion parameter; contrls the trade-off between expected cost and variance (risk)

Quadratically constrained quadratic program (QCQP)#

  • minimize \((1/2) x^T P_0 x + q_0^T x + r_0\)

  • s.t. \((1/2) x^T P_i x + q_i^T x + r_i \le 0\), \(Ax=b\)

Second-order cone program (SOCP)#

Example 22 (Second-order cone program (SOCP))

  • minimize \(f^T x\)

  • s.t \(||A_i x + b_i||_2 \le c_i^T x + d_i\), \(Fx=g\)

second-order cone

  • norm cone: \(\{ (x,t) \mid ||x||\le t \}\)

  • when the norm is Euclidean norm: second-order cone/quadratic cone/ice-cream cone

the inequality constraints are called second-order cone (SOC) constraints

More general than QCQP and LP

  • if \(n_i=0\), reduces to LP

  • if \(c_i=0\), reduces to QCQP

Robust linear program#

The parameters in an LP can be uncertain

  • minimize \(c^T x\)

  • s.t. \(a_i^T x \le b_i\), there can be uncertainty in \(c\), \(a_i\), \(b_i\)

Two common approaches to handle uncertainty in \(a_i\) (for simplicity)

  • convert to deterministic model: constraints must hold for all \(a_i\in E_i\)

  • convert to stochastic model: constraints must hold with probability \(\eta\) (\(a_i\) as the random variable)

deterministic approach via SOCP#

assume all \(a_i\) lie in given ellipsoids \(a_i\in E_i = \{ \bar{a}_i + P_i u \mid ||u||_2 \le 1 \}\)

derivation of the SOCP

  • the constraint is equivalent to \(prob(\frac{u-\bar{u}}{\sigma}\le \frac{b_i-\bar{u}}{\sigma})\ge \eta\) where \(\frac{u-\bar{u}}{\sigma}\) is a 0-mean, unit variance Gaussian variable

  • sup

Geometric programming#

  • monomial function \(f(x) = c x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}, dom(f) = R_{++}^n\) with \(c>0\); exponent \(\alpha_i\) can be any real number.

  • posynomial function: sum of monomials \(f(x) = \sum_{k=1}^K c_k x_1^{a_{1k}} x_2^{a_{2k}} \cdots x_n^{a_{nk}}, dom(f) = R_{++}^n\) with \(c>0\); exponent \(\alpha_i\) can be any real number.

difference with polynomial: \(\alpha_i \in R\) and \(dom(f) = R_{++}^n\)

gemetric program (GP) max s.t.

Geometric program in convex form#

change variables to \(y_i=\log x_i\), and take logarithm of objective, constraints

  • monomial \(f(x) = c x_1^{a_n} \cdots x_n^{a_n}\) transforms to \(\log f(e^{y_1},\dots, e^{y_n}) = a^T y + b, \; (b=\log c)\)

  • geometric program transforms to convex problem

    • minimize \(\log (\sum \exp(a^T y + b))\)

    • s.t.

Example 23 (Maximize the volume of a box-shaped structure)

  • optimization variables: height \(h\), width \(w\), depth \(d\)

  • constraints: a limit on the total wall area, a; a limit on the floor area, b; lower and upper bounds on the aspect ratio h/w, c and e;