Convex Set and Functions#
The great watershed in optimization isn’t between linearity and non-linearity, but convexity and non-convexity. -R. Tyrrell Rockafellar ()
Convex Definitions#
Definition 3 (Convex set)
contains line segment between any two points in the set \(x_1, x_2 \in C, 0 \le \theta \le 1 \to \theta x_1 + (1-\theta) x_2\)
Convex hull#
affine set: contains the line through any two distinct points in the set
Definition 4 (Convex combination)
convex combination of \(x_1, \dots, x_k\): any point \(x\) of the form \(x=\theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k\) with \(\sum_i \theta_i = 1\), \(\theta_i \ge 0\)
Definition 5 (Convex hull)
convex hull conv \(S\): set of all convex combinations of points in \(S\) \(conv(S)=\{\theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k \mid x_i\in S, \sum_i \theta_i = 1\), \(\theta_i \ge 0\}\)
Convex cone#
Definition 6 (Cone)
a set \(C\) is a cone, if for every \(x\in C\) and \(\theta \ge 0\), we have \(\theta x \in C\).
Definition 7 (Convex cones)
a set \(C\) is a convex cone, if it is a cone and convex, i.e., \((\forall x_1, x_2 \in C, \theta_1 \ge 0, \theta_2 \ge 0) (\theta_1 x_1 + \theta_2 x_2 \in C)\)
conic (nonnegative) combination of \(x_1\) and \(x_2\): any point of the form \(x=\theta_1 x_1 + \theta_2 x_2\) with \(\theta_1\ge 0, \theta_2 \ge 0\).
a convex cone \(K\subseteq R^n\) is a proper cone if
\(K\) is closed (contains its boundary)
\(K\) is solid (has nonempty interior)
\(K\) is pointed (contains no line), i.e., \(x\in K, -x \in K \to x=0\)
Example 2
nonnegative orthant
positive semidefinite cone
Positive semidefinite#
\(S^n\) is the set of symetric \(n\times n\) matrices
\(S^n_+=\{X\in S^n\mid X\succeq 0\}\):
positive semidefinite\(n\times n\) matrices (iff \(\forall z, z^T X z \ge 0 \))\(S^n_{++}=\{X\in S^n\mid X\succ 0\}\):
positive definite\(n\times n\) matrices
property: $S_+^n$ is a proper cone
Generalized inequalities#
generalized inequality defined by a proper cone \(K\):
\(x \preceq_K y \Leftrightarrow y-x \in K\),
\(x \prec_K y \Leftrightarrow y-x \in int K\)
Example 3
componentwise inequality (\(K=R_+^n\))
matrix inequality (\(K=S_+^n\))
Hyperplane and halfspace#
Definition 8 (Hyperplane)
hyperplane: set of the form \(\{ x \mid a^T x = b \} (a\neq 0)\)
where \(a\) is the normal vector; hyperplanes are affine and convex
Definition 9 (Halfspace)
halfspace: set of the form \(\{ x \mid a^T x \le b \} (a\neq 0)\)
halfspaces are convex
Euclidean balls and epplipsoids#
Definition 10 (Euclidean ball)
(Euclidean) ball with ceneter \(x_c\) and radius \(r\):
\(B(x_c,r) = \{ x\mid || x - x_c ||_c \le r \} = \{ x_c + ru \mid ||u||_2 \le 1 \}\)
Definition 11 (Ellipsoid)
Ellipsoid: set of the form
\(x\mid (x-x_c)^T P^{-1} (x-x_c) \le 1\) with \(P\in S_{++}^n\).
Other representation: \(\{ x_c + Au \mid ||u||_2 \le 1 \}\) with \(A\) square and nonsingular (\(A=P^{1/2}\))
Norm#
Definition 12 (Norm)
norm: a function \(||\cdot ||\) that satisfies
\(||x||\ge 0\); \(||x||=0\) iff \(x=0\)
\(||tx||=|t| ||x||\) for \(t\in R\) (homogeneous)
\(||x+y||\le ||x|| + ||y||\) (triangle inequality)
Example 4 (Norm)
\(l_p\)-norm
\(l_1\)-norm
\(l_2\)-norm
\(l_\infty\)-norm
Polyhedron and polytope#
Definition 13 (Polyhedron)
Polyhedron is solution set of finitely many linear inequalities and equalities
\(P=\{ x \mid a_j^T x \le b_j, c_j^T x = d_j \}\)
polyhedron is intersection of finite number of halfspaces and hyperplanes
polyhedron is convex
polytopeis a bounded polyhedron
Methods for establishing convexity of a set#
apply definition
show that C is obtained from simple convex sets (hyperplanes, halfspaces, norm balls, … ) by operations that preserve convexity (intersection, affine functions, perspective function, linear-fractional functions)
Intersection#
If \(S_i\) is (affine, convex, convex cone), for \(i\in A\), then \(\cap_{i\in A} S_i\) is (affine, convex, convex cone)
intersection need not be finite: e.g., a convex set is intersection of infinite halfspaces
Affine function#
Affine function \(f: R^n \to R^m\), \(f(x) = Ax + b \) with \(A\in R^{m\times n}\), \(b\in R^{m}\)
if \(S\) is convex, \(f(S)\) is also convex
if \(C\) is convex, \(f^{-1}(C)\) is also convex
Example 5
The ellipsoid is the image of the unit ball under the affine mapping \(f(u)=P^{1/2} u + x_c\). It is also the inverse image of the unit ball under the affine mapping \(g(x) = P^{-1/2} (x-x_c)\).
Perspective function & linear-fractional function#
Definition 14 (perspective function)
perspective function \(P:R^{n+1} \to R^n\): \(P(x,t) = x/t, dom(P)=\{ (x,t)\mid t>0 \}\)
Definition 15 (linear-fractional function)
linear-fractional function \(P:R^{n} \to R^m\): \(f(x,t) = \frac{Ax+b}{c^T x + d}, dom(f)=\{ x\mid c^x x + d >0 \}\)
images and inverse images of convex set under these functions are convexperspective functions preserve line \to preserve convexity
Separating hyperplane theorem#
Theorem 3
if \(C\) and \(D\) are disjoint convex sets, then there exists \(a\neq 0\), \(b\) s.t. the hyperplane \(\{ x \mid a^T x = b \}\) separates \(C\) and \(D\) i.e., \(a^T x \le b\) for \(x\in C\), \(a^T x \ge b\) for \(x\in D\)
supporting hyperplane to set \(C\) at boundary point \(x_0\): \(\{ x \mid a^T x = a^T x_0 \}\)
where \(a\neq 0\) and \(a^T x \le a^T x_0 \forall x \in C\)
Theorem 4 (Supporting hyperplane theorem)
if \(C\) is convex, then there exists a supporting hyperplane at every boundary point of \(C\).
Convex functions#
Definition 16 (Convex function)
a function \(f:R^n\to R\) is convex if dom is a convex set and
\(f(\theta x+(1-\theta) y) \le \theta f(x) + (1-\theta) f(y), \forall x, y\in D, 0\le \theta \le 1\).
(strictly convex)
Example 6 (Convex functions)
convex:
affine
exponential
powers: \(x^\alpha\) on \(R_{++}\), for \(\alpha\ge 1\) or \(\alpha\le 0\)
powers of absolute value: \(\abs{x}^P\) on \(R\), for \(p\ge 1\)
negative entropy: \(x\log x\) on \(R_{++}\)
concave:
affine
powers: for \(0\le \alpha \le 1\)
logarithm: \(\log x\) on \(R_{++}\)
all norms are convex
Definition 17 (Epigraph of a function)
\(\alpha\)-sublevel set of \(f:R^n\to R\): \(C_{\alpha} = \{x\in dom f \mid f(x) \le \alpha \}\) sublevel sets of convex functiosn are convex
epigraph of \(f: R^n\to R\)
if all sublevel sets of a function are convex, is the function necessarily convex? (butterfly like function)
Theorem 5 (Epigraph convex)
\(f\) is convex iff \(epi(f)\) is a convex set.
Proof. two directions
Definition 18 (Differentiable functions)
\(f\) is differentiable if \(D\) is open and the gradient \(\nabla f(x)\) exists at each \(x\in D\)
f (defined on an open domain) is convex, then f is continuousf is convex, then f is differentiable. (False)
Definition 19 (1st-order condition)
differentiable \(f\) with convex domain is convex iff
first-order approximation of f is always underestimator. (first-order Taylor)
Definition 20 (second-order condition)
\(f\) is twice differentiable if \(D\) is open and the Hessian \(\nabla^2 f(x)\) exists at each \(x\in D\)
Theorem 6
for twice differentiable f with convex domain, f is convex iff \(\nabla^2 f(x) \succeq 0, \forall x \in D\)
if \(\nabla^2 f(x) \succ 0 \forall x\in D\), then \(f\) is strictly convex. (converse is not true, \(f(x) = x^4\))
Example 7 (Quadratic function)
\(f(x) = x^2\), \(f(x) = (1/2)x^T Px + q^T x +r \) with \(P\in S^n\), then \(\nabla (x) = Px+q\), \(\nabla^2 f(x) =P\). convex iff \(P \succeq 0\)
Example 8 (Least-square objective)
\(f(x)=||Ax-b||_2^2\), \(\nabla f(x)=2A^T(Ax-b)\), \(\nabla^2f(x) = 2A^T A\) convex (for any A)
Definition 21 (Global minimum)
if \(f\) is convex, differenetiable, and \(\nabla f(x*)=0\), then \(f(y)\ge f(x^*), \forall y\), i.e., \(x^*\) is the global minimum of \(f(x)\)
if f is twice differentiable and \(\nabla f(x*)=0\), \(\nabla^2 f(x*)\succeq 0\), then \(f(y)\ge f(x^*) \forall\), i.e., \(x^*\) is a global minimum of \(f(x)\).
a function may not have a global minimum even if it is convex, e.g., \(f(x)=x\), \(f(x)=e^x\)
these properties only apply to uncontrained minimization, \(f(x,y)=x^2+y^2\). if \(D=\{(x,y)| xy\ge 1, x\ge 0, y\ge 0\}\), then you cannot find minimum at \(\nabla f(x)=0\)
Definition 22 (Local minimum)
Local minimum: \(x^*\) is a local minimum of unconstrained function \(f\) if it is no worse than its neighbors, i.e., $\( \exists \epsilon >0, s.t., f(x^*)\le f(x), \forall x, ||x-x^*||_2\le \epsilon \)$ property:
local minimum (assume twice differentiable), then \(\nabla f(x*)=0\), \(\nabla^2 f(x*)\succeq 0\) (converse not True, \(f(x)=x^3\))
\(\nabla f(x*)=0\), \(\nabla^2 f(x*)\succ 0\), local minimum
Methods for establishing convexity of a set#
apply definition
show that \(C\) obtained from the operations:
intersection
affine
Perspective function & lienar-fractional function
Intersection#
If \(S_i\) is (affine, convex, convex cone), for \(i\in A\), then \(\cap_{i\in A} S_i\) is (affine, convex, convex cone)
intersection need not be finite: e.g., a convex set is intersection of infinite halfspaces. (Union not the case)
Affine function#
a general form of linear function (constant term)
Definition 23 (Affine function)
Affine function
\(f(x)=Ax + b, \; A\in R^{m\times n}, b\in R^{m}\)
If \(S\) is convex, \(f(S)\) is also convex; If \(C\) is convex, \(f^{-1}(C)\) is also convex;
Example 9 (Affine)
The epplisoid {prf:ref} is the image of the unit ball under the affine mapping.
It is also the inverse image of the unit ball under the affine mapping
Perspective function & lienar-fractional function#
perspective function \(P: R^{n+1} \to R^n\):
\(P(x,t) = x/t,\; dom(P)={(x,t)|t>0}\)
linear-fractional function \(f: R^n\to R^m\)
\(f(x) = \frac{Ax+b}{c^T x+d}, \; dom(f)={x|c^T x+d>0}\)
perspective functions preserve lines s.t. preserve convexity
\(P(\theta x +(1-\theta)y)=\dots=\mu P(x) + (1-\mu)P(y)\)
Theorem 7 (Separating hyperplane theorem)
If \(C\) and \(D\) are two disjoint (convex?) sets, there exists
Theorem 8 (Supporting hyperplane theorem)
supporint hyperplane to set \(C\) at boundary point \(x_0\):
\(\{x| a^T x = a^T x_0\}\) where \(a\neq 0\) and \(a^x \le a^T x_0, \forall x\in C\).
If \(C\) is convex, then there exists a supporitn hyperplane at every boundary point of \(C\)
Operations that preserve convexity#
nonnegative multiple
sum
composition with affine function
example:
log barrier for linear inequalities
piecewise-linear function
pointwise supremum: if \(f(x,y)\) is convex in \(x\) for each \(y\in A\), then \(g(x)=\sup_{y\in A} f(x,y)\) is convex
Remark 2
The infimum of a subset S of a partially ordered set P, assuming it exists, does not necessarily belong to S. If it does, it is a minimum or least element of S. Similarly, if the supremum of S belongs to S, it is a maximum or greatest element of S.
examples:
distance to farthes point in a set \(C\): \(f(x)=\sup_{y\in C} ||x-y||\)
minimization: if \(f(x,y)\) is convex in \((x,y)\) and \(C\) is a convex set, then \(g(x) = \inf_{y\in C} f(x,y)\) is convex.
composition of scalar functions#
composition of \(g\): \(R^n\to R\) and \(h: R\to R\): \(f(x)=h(g(x))\) \(f\) is convex if \(g\) convex, \(h\) convex, \(\tilde{h}\) nondecreasing
discover composition rules for \(n=1\), twice differentiable \(g, h\)
composition of \(g\): \(R^n\to R^k\) and \(h: R^k\to R\): \(f(x)=h(g(x))\) \(f\) is convex if \(g_i\) convex, \(h\) convex, \(\tilde{h}\) nondecreasing in each argument
composition rule for \(n=1\), \(f''(x) =g'(x)^T\nabla^2 h(g(x))g'(x) + \nabla h(g(x))^T g''(x)\)
the perspective of a function \(f:R^n\to R\) is the function \(g:R^n\times R\to R\),
$\(
g(x,t) = t f(x/t), \; dom(g) = \{ (x,t) | x/t \in dom(f), t>0 \}
\)\(
\)g\( is convex if \)f$ is convex.
Proof. For \(t>0\) we have \((x,t,s)\in epi(g) \Leftrightarrow tf(x/t)\le s \Leftrightarrow f(x/t) \le s/t \Leftrightarrow (x/t,s/t)\in epi(f)\) epi(g) is the inverse image of epi f under perspective mapping
f is convex implies epi f is convex implies epi g is convex implies g is convex.
Example 10
\(f(x) = x^T x\) is convex; hence \(g(x,t)=x^T x/t\) is convex for \(t>0\)
negative logarithm \(f(x) = -\log x\) is convex: hence relative entropy \(g(x,t) = t \log t - t \log x\) is convex on \(R_{++}^2\).
if \(f\) is convex, then $\(g()\)$
The conjugate of a function \(f\) is
\(f^*(y) = \sup_{x\in dom(f)} (y^T x - f(x))\)
(which means \(f^*(y)\) is the max gap between \(yx\) and \(f(x)\))
\(f^*\) is convex (even if \(f\) is not), Pointiwse supremum of a family of affine functions of y
Example 11
Affine function. \(f(x) = ax + b\). \(f^*(y)=\sup_x(yx-ax-b)= -b, y=a; \infty, otherwise\)
negative logarithm. \(f^*(y)=\sup_{x>0} (xy+\log x)\)
quasiconvex#
\(f:R^n\to R\) is quasiconvex if dom f is convex and the sublevel sets
\(C_\alpha = \{ x\in dom(f) | f(x) \le \alpha \}\)
is convex for all \(\alpha\).
quasilinear
Example 12
\(\sqrt{|x|}\) is quasiconvex on R
\(ceil(x) = \inf \{ z\in Z \mid z\ge x \}\) is quasilinear
\(\log x\) is quasilinear on \(R_{++}\)
\(f(x_1, x_2) = x_1 x_2\) is quasiconcave on \(R_{++}^2\)
linear-fractional function is quasilinear
properties
for quasiconvex \(f: 0\le \theta \le 1 \implies f(\theta x + (1-\theta)y)\le \max \{ f(x), f(y) \}\)
first-order condition: differentiable \(f\) with cvx domain is quasiconvex iff
\(f(y)\le f(x) \implies \nabla f(x)^T (y-x)\le 0\).
a positive function f is log-concave if \(\log f\) is concave:
\(f(\theta x + (1-\theta)y)\ge f(x)^\theta f(y)^{1-\theta}\) for \(0\le \theta \le 1\).
log-convex: f is log-convex if log f is convex.
example:
powers: \(x^a\) on \(R_{++}\) is log-convex for \(a\le 0\), log-concave for \(a\ge 0\).