Convex Program#
Convex optimization#
convex program: objective and constraint functions are convex
\(f_i(x)\) is convex if its domain is convex set and \(f_i(\alpha x + \beta y) \le \alpha f_i(x) + \beta f_i(y)\) if \(\alpha + \beta = 1\), \(\alpha\ge 0\), \(\beta \ge 0\).
Examples: linear, quadratic functions
Special case: linear programming problems, least-square problems
standard form convex opt problem: \(f_0, f_1, \dots, f_m\) are convex: equality constraints are affine
important property: feasible set of a convex opt prob is cvx
generalized inequalities are also ok
\(\min c^T x\) subject to: \(A_0+A_1x_1 + A_2x_2 +\dots + A_n x_n\preceq 0\)
any locally opt point of a cvx program is (globally) optimal (proof by contradiction)
Optimality criteria for differentiable convex \(f_0\)#
\(x\) is optimal if and only if it is feasible and \(\nabla f_0(x)^T (y-x) \ge 0 \text{ for all feasible } y\)
Proof. directional derivative $\( \frac{d}{dt} f(x+tv) \mid_{t=0} = \nabla f(x)^T v \)$
uncontrained problem: \(x\) is optimal iff \(x\in dom f_0\), \(\nabla f_0(x)=0\)#
equality constrained problem#
\(\min f_0(x)\) subject to \(Ax = b\) \(x\) is optimal iff there exists a \(\nu\) such that \(x\in dom f_0, Ax=b, \nabla f_o(x) + A^T \nu = 0\)
Proof. By Null space and range Definition 24
Definition 24 (Null space)
\(N(A) = \{ x\in R^n \mid Ax = 0 \}\)
\(R(A) = \{ v\in R^n \mid Ax = v \}\)
minimization over nonnegative orthant#
minimize \(f_0(x)\) subject to \(x \preceq 0\), \(x\) is optimal iff \(x\in dom f_0\), \(x\preceq 0\), \(\nabla f_0(x)_i\ge 0,x_i=0\) and \(\nabla f_0(x)_i=0,x_i>0\).
Equivalent convex programs#
two problems are (informally) equivalent if the solution of one is readily obtained from the solution of the other, and vice-versa
eliminating equality constraints#
minimize \(f_0(x)\)
subject to \(f_i(x)\le 0\), \(Ax=b\) find
\(x_0\): a particular solution of \(Ax = b\)
matrix \(F\): whose range is the null space of \(A\), i.e., \(R(F)=N(A)\) is equivalent to
minimize (over z) \(f_0(Fz+ x_0)\)
subject to \(f_i(Fz+ x_0) \le 0\), \(i=1,\dots, m\)
introducing equality constraints#
introducing slack variables for linear inequalities#
minimize \(f_0(x)\)
subject to \(a_i^T x \le b_i\) is equivalent to
…
…
epigraph form:#
standard form cvx problem is equivalent to
minimizing over some variables#
small perturbation of the problem makes it very hard (potentially)
max convex or min concave
non-convex
…
Quasiconvex optimization#
\(f_0: \R^n \to \R\) quasiconvex, \(f_i\) convex, (local) optimal, … (global) optimal
such a representation always exists and not unique
Example 13
\(\phi_t(x) = 0, f_0(x) \le t; \infty , otherwise\)
\(\phi_t(x) = dist(x, \{z\mid f_0(z) \le t\})\)
convex over concave functions: \(f_0(x)=\frac{p(x)}{q(x)}\) with \(p\) convex, \(q\) concave, and \(q(x)>0\) on \(dom f_0\) can take \(\phi_t(x) = p(x) - tq(x)\):
for \(t\ge 0\), \(\phi_t\) convex in \(x\)
\(p(x)/q(x)\le t\) iff \(\phi_t(x)\le 0\).
quasiconvex opt via convex feasibility problems#
let \(p^*\) denote the optimal value of the quasiconvex program
minimize \(f_0(x)\)
s.t. \(f_i(x)\le 0\), \(Ax=b\)
solve the feasibility problem:
find \(x\)
s.t. \(\phi_t(x) \le 0\), \(f_i(x)\le 0\), \(Ax=b\)
bisection method for quasiconvex opt#
requires exactly \(O(\log_2 ((u-l)/\epsilon)\) iterations
Linear program (LP)#
general form
\(\min c^T x \)
\(\min c^T x + d\)
s.t. \(Gx\preceq h\), \(Ax = b\)
convex problem with affine objective and constraint functions
feasible set is a polyhedron (affine inequality defines half hyperplane)
standard form
\(\min c^T x \)
s.t. \(Ax=b\), \(x\succeq 0\)
canonical form
\(\min c^T x \)
s.t. \(Ax\succeq b\), \(x\succeq 0\)
convertible between (general, standard, canonical form)
Proof. introduct slack variable \(s\)
\(G(x^+ - x^-) + s = h\)
\(A(x^+-x^-) = b\)
\(x^+\succeq 0\), \(x^-\succeq 0\), \(s\succeq 0\) (\([x^+, x^-, s]\))
Example 14 (Piecewise-linear minimization)
minimize \(\max_{i=1,\dots, m} (a_i^T x+b_i)\) equivalent to an LP
minimize \(t\)
s.t. \(a_i^T x + b_i \le t\), \(i=1,\dots,m\)
Example 15 (Largest sphere inside a polyhedron)
\(P=\{ x \mid a_i^T x \le b_i, i = 1, \dots, m \}\)
Chebyshev center of a polyhedron center of the largest sphere \(B=\{ x_c + u \mid ||u||_2 \le r \}\)
hence, \(x_c\), \(r\) can be determined by solving the LP
max \(r\)
s.t. \(x_c + u \in P\), \(\forall ||u||_2 \le r\)
\(a_i^T (x_c +u) \le b_i\), \(i=1,\dots,m\), \(\forall ||u||_2 \le r\)
\(a_i^T x_c + a_i^T u \le b_i\), \(i=1,\dots,m\), \(\forall ||u||_2 \le r\)
\(a_i^T x_c + sup_{||u||_2 \le r} a_i^T u \le b_i\), \(i=1,\dots,m\)
Integer Linear program (IP, ILP)#
form
\(\min c^T x + d\)
s.t. \(Gx\preceq h\)
\(Ax = b\)
\(x\in Z_n\)
Example 16 (Shortest path problem)
find a path from \(s\) to \(t\) in directed graph \(G=(V,E)\) withthe smallest total cost
\(\min \sum_{(i,j)\in E} c_{ij} x_{ij}\)
subject to: \(\sum_{j:(i,j)\in E} x_{ij} - \sum_{j:(j,i)\in E} x_{ji} = \)
Example 17 (Min vertex cover)
find the smallest subset of vertices in undirected graph \(G=(V,E)\) that contains at least one endpoint of every edge in the graph
\(\min \sum_{i\in V} x_i\)
s.t. \(x_i + x_j \ge 1\), \(\forall (x,j) \in E\)
x_i \in {0,1 }, \(\forall \in E\)
Example 18 (Max independent set)
Example 19 (Max weighted bipartite matching (perfect matching))
Integer programming is generally NP-Hard.
One natural idea for solving ILPs: “relax” the integrality constraint, i.e., allow \(x\) to take on real values
problme: the LP’s optimum can be different from the optimal of the ILP to take on real values
problem: the LP’s optimum can be different from the optimal of the ILP.
integrality gap: \(sup_i \frac{OPT(I)}{OPT_f(I)}\)
solution: try to round the solution of the LP to a feasible integral point and take that as the solution to the ILP
determining how to do rounding to achieve the optimum can be as hard as solving the original ILP itself
develop
approximation algorithmsthat go from the optimal point of the LP to a point that is nearby the optimal point of the ILP, and bound how far off it is from the true optimal point
Example 20
solving the minimum vertex cover problem by relaxing the integer constraints to \(x_i\ge 0\), \(\forall i\in V\), and rounding the solution of the LP to obtain the (approximate) solution to the ILP
proof: let \(x_i^*\) be the optimal solution of the LP, we have \(x_i^* \in \{0, 1/2, 1 \}\). Then \(\sum_{i\in V} x_i \le \sum_{i\in V} 2x_i^* = 2\text{OPT}_f(I)\le 2\text{OPT}(I)\). approximation ratio: \(\alpha=2\)
Total unimodularity#
There are special cases that the relaxation is exact, i.e., the relaxed LP provides an integer optimal point.
An integer matrix is totally unimodular (TUM) if the determinant of every square submatrix of the amtrix is either -1, 0, or 1.
e.g., incidence matrix of a directed graph;
incidence matrix of an undirected bipartite graph
If A is TUM and b is an integer vector, all vertices of the polyhedron \(P=\{ x\mid Ax=b,x\succeq 0 \}\) (or \(P=\{ x\mid Ax\succeq b,x\succeq 0 \}\)) are integer
An LP in the standard or canonical form where constraint matrix A is TUM and b is an integer vector, has an integer optimal point
So an ILP whose LP relation satisfies the above conditions can be solved exactly by solving its LP relaxation!
e.g. shortest path problem
the incidence matrix correspond to the conservation constraint
Quadratic program (QP)#
Definition 25 (Quadratic program)
minimize (1/2)x^T P x + q^T x + r
s.t. Gx \preceq h, Ax = b
\(P\in S_+^n\), so objective is convex quadratic
minimize a convex quadratic function over a polyhedron
Examples#
Least square#
Example 21 (Least square)
\(\min || Ax - b||_2^2 = (Ax-b)^T(Ax-b) = ....\), set gradient to zero: \(2A^T A x - 2A^T b = 0\), the solution \(x^*=\)
Linear program with random cost#
\(\min c^T x + \gamma x^T \Sigma x = E[c^T x] + \gamma Var(c^T x\)
s.t. \(Gx \preceq h\), \(Ax=b\)
\(Var(c^T x)=x^T E[(c-\bar{c})(c-\bar{c})^T] x\) where covariance matrix \(Conv(c_i,c_j)=E[(c_i-\bar{c}_i)(c_j-\bar{c}_j)]\). \(\gamma>0\) is a risk aversion parameter; contrls the trade-off between expected cost and variance (risk)
Quadratically constrained quadratic program (QCQP)#
minimize \((1/2) x^T P_0 x + q_0^T x + r_0\)
s.t. \((1/2) x^T P_i x + q_i^T x + r_i \le 0\), \(Ax=b\)
Second-order cone program (SOCP)#
Example 22 (Second-order cone program (SOCP))
minimize \(f^T x\)
s.t \(||A_i x + b_i||_2 \le c_i^T x + d_i\), \(Fx=g\)
second-order cone
norm cone: \(\{ (x,t) \mid ||x||\le t \}\)
when the norm is Euclidean norm: second-order cone/quadratic cone/ice-cream cone
the inequality constraints are called second-order cone (SOC) constraints
More general than QCQP and LP
if \(n_i=0\), reduces to LP
if \(c_i=0\), reduces to QCQP
Robust linear program#
The parameters in an LP can be uncertain
minimize \(c^T x\)
s.t. \(a_i^T x \le b_i\), there can be uncertainty in \(c\), \(a_i\), \(b_i\)
Two common approaches to handle uncertainty in \(a_i\) (for simplicity)
convert to
deterministicmodel: constraints must hold for all \(a_i\in E_i\)convert to
stochasticmodel: constraints must hold with probability \(\eta\) (\(a_i\) as the random variable)
deterministic approach via SOCP#
assume all \(a_i\) lie in given ellipsoids \(a_i\in E_i = \{ \bar{a}_i + P_i u \mid ||u||_2 \le 1 \}\)
derivation of the SOCP
the constraint is equivalent to \(prob(\frac{u-\bar{u}}{\sigma}\le \frac{b_i-\bar{u}}{\sigma})\ge \eta\) where \(\frac{u-\bar{u}}{\sigma}\) is a 0-mean, unit variance Gaussian variable
sup
Geometric programming#
monomialfunction \(f(x) = c x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}, dom(f) = R_{++}^n\) with \(c>0\); exponent \(\alpha_i\) can be any real number.posynomialfunction: sum of monomials \(f(x) = \sum_{k=1}^K c_k x_1^{a_{1k}} x_2^{a_{2k}} \cdots x_n^{a_{nk}}, dom(f) = R_{++}^n\) with \(c>0\); exponent \(\alpha_i\) can be any real number.
difference with polynomial: \(\alpha_i \in R\) and \(dom(f) = R_{++}^n\)
gemetric program (GP)
max
s.t.
Geometric program in convex form#
change variables to \(y_i=\log x_i\), and take logarithm of objective, constraints
monomial \(f(x) = c x_1^{a_n} \cdots x_n^{a_n}\) transforms to \(\log f(e^{y_1},\dots, e^{y_n}) = a^T y + b, \; (b=\log c)\)
geometric program transforms to convex problem
minimize \(\log (\sum \exp(a^T y + b))\)
s.t.
Example 23 (Maximize the volume of a box-shaped structure)
optimization variables: height \(h\), width \(w\), depth \(d\)
constraints: a limit on the total wall area, a; a limit on the floor area, b; lower and upper bounds on the aspect ratio h/w, c and e;