Interior-Point Methods (IPMs) have proven to be effective for solving optimization problems. Since any differentiable convex quadratic problem can be formulated as a monotone Linear Complementarity Problem (LCP), i.e., a $P_*(0)$ LCP [1], and since the quality of an IPM variant is measured by whether it can be generalized to $P_*(\kappa)$ LCPs [2], $P_*(\kappa)$ LCPs have become an active area of research.
In this paper, we consider the following LCP:
where $M\in R^{n\times n}$ is a $P_*(\kappa)$ matrix and $q\in R^n$.
It is well known that primal-dual path-following methods use the classical Newton direction as the search direction, which is closely related to the primal-dual logarithmic barrier function. Peng et al. [3] presented a new variant of primal-dual IPMs based on self-regular kernel functions and established the complexity $O(\sqrt{n}\log n\log\frac{n}{\varepsilon})$ for large-update IPMs. Bai et al. [4] proposed a new class of kernel functions which are neither logarithmic barrier functions nor self-regular, and obtained polynomial complexity for LO. He et al. [5] introduced a self-adjusting IPM for LCP based on a logarithmic barrier function and gave some numerical results. Recently, based on kernel functions, Cho [6] derived the iteration bound $O\big(q^{\frac{3}{2}}(1+2\kappa)\sqrt{n}(\log n)^{\frac{q+1}{q}}\log \frac{n}{\varepsilon}\big)$ for $P_*(\kappa)$ LCPs. Amini and Peyghami [7] obtained the iteration complexity $O\big((1+2\kappa)\sqrt{n}\log n\log(\log n)\log\frac{n}{\varepsilon} \big)$ in a special case.
In general, each kernel function gives rise to a search direction and determines a primal-dual interior-point algorithm. The barrier function induced by the kernel function can be regarded as a measure of the distance from the given point to the central path [8]. In this paper, we focus on $P_*(\kappa)$ LCPs based on a new kernel function, design a large-update interior-point algorithm, and obtain the iteration complexity
which is better than the bounds of large-update primal-dual IPMs based on some other kernel functions. The proposed kernel function is strongly convex, and it is neither self-regular nor the usual logarithmic kernel function. We also provide a simple analysis of the iteration complexity of the algorithm based on the new kernel function, which differs from the analysis of other primal-dual IPMs based on logarithmic barrier functions and on kernel functions recently introduced in the optimization literature.
This paper is organized as follows. In Section 2, we recall basic concepts and the notion of the central path. We describe the kernel function and its growth properties for $P_*(\kappa)$ LCPs in Section 3. The analysis of the feasible step size and of the decrease of the proximity function during an inner iteration is given in Section 4. In Section 5, we derive the total number of iterations of our algorithm. Section 6 contains some concluding remarks.
The following notations are used throughout the paper. For $x\in R^n $,
i.e., the minimal component of $x$. The vector norm is the 2-norm, denoted by $\|\cdot\|$. We denote the all-one vector of length $n$ by $e$. The diagonal matrix with diagonal vector $x$ is denoted by $X=\text{diag}(x)$. The index set $J$ is $J=\{1, 2, \cdot\cdot\cdot, n\}$. For $x, s\in R^n$, $xs$ denotes the coordinate-wise (Hadamard) product, and $x^{T}s$ denotes the scalar product. We write $f(t)=\Theta(g(t))$ if there exist positive constants $\omega_1$ and $\omega_2$ such that $\omega_{1}g(t)\leq f(t)\leq \omega_{2}g(t)$ for all $t>0$, and $f(t)=O(g(t))$ if there exists a positive constant $\omega$ such that $f(t)\leq \omega g(t)$ for all $t>0$.
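For concreteness, these conventions correspond to elementary NumPy operations; the following small sketch (with arbitrary sample values) fixes the counterparts used in the later code fragments:

```python
import numpy as np

x = np.array([2.0, 0.5, 1.0])
s = np.array([1.0, 3.0, 0.25])

xs    = x * s              # coordinate-wise (Hadamard) product xs
xTs   = float(x @ s)       # scalar product x^T s
x_min = x.min()            # minimal component of x
X     = np.diag(x)         # diagonal matrix X = diag(x)
e     = np.ones(len(x))    # all-one vector e of length n
```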
The $P_*(\kappa)$ matrix and $P_*(\kappa)$ LCP were introduced by Kojima et al. [1]. Here we give the definition of a $P_*(\kappa)$ matrix and list some important properties related to its use in IPMs.
Definition 2.1 Let $\kappa \geq 0$. A matrix $M\in R^{n\times n}$ is called a $P_*(\kappa)$ matrix if
for all $x\in R^n$, where $J_+(x)=\{i\in J:x_i(Mx)_i\geq 0\}$ and $J_-(x)=\{i\in J:x_i(Mx)_i<0\}$.
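The defining inequality is $(1+4\kappa)\sum_{i\in J_+(x)}x_i(Mx)_i+\sum_{i\in J_-(x)}x_i(Mx)_i\geq 0$ for all $x\in R^n$ [1]. The sketch below probes this condition on random sample vectors; finding no violation is of course only a necessary check, not a certificate that $M$ is a $P_*(\kappa)$ matrix:

```python
import numpy as np

def violates_p_star(M, kappa, trials=10000, seed=0):
    """Search for a vector x violating the P_*(kappa) inequality
    (1 + 4*kappa) * sum_{i in J_+} x_i (Mx)_i + sum_{i in J_-} x_i (Mx)_i >= 0.
    Returns a violating x if one is found among the random samples, else None."""
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    for _ in range(trials):
        x = rng.standard_normal(n)
        w = x * (M @ x)                              # w_i = x_i (Mx)_i
        lhs = (1 + 4 * kappa) * w[w >= 0].sum() + w[w < 0].sum()
        if lhs < -1e-12:
            return x
    return None

# Example: any positive semidefinite M is P_*(0).
M = np.array([[2.0, -1.0], [-1.0, 2.0]])
print(violates_p_star(M, kappa=0.0))                 # expected: None
```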
Note that for $\kappa=0$ the $P_*(0)$ matrices are exactly the positive semidefinite matrices. This implies that convex quadratic programs and LO problems form a subclass of $P_*(0)$ LCPs. In what follows, we give some definitions related to convexity which will be essential tools in our analysis.
Definition 2.2 A twice differentiable function $f:X(\subset R)\rightarrow R$ is strongly convex if and only if there exists $m_0>0$ such that $f''(x)\geq m_0$ for all $x\in X$.
Definition 2.3 A function $f:X(\subset R)\rightarrow R$ is exponentially convex if and only if
Proposition 2.1 For $P_*(\kappa)$ matrix $M\in R^{n\times n}$, the matrix
is a nonsingular matrix for any positive diagonal matrices $X, S\in R^{n\times n}$[6].
Corollary 2.2 Let $M\in R^{n\times n}$ be a $P_*(\kappa)$ matrix and $x, s\in R^{n}_{++}.$ Then, for all $a\in R^n$ the system
has a unique solution $(\Delta x, \Delta s)$, where $X=\text{diag}(x)$ and $S=\text{diag}(s)$.
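For illustration, the system of Corollary 2.2 can be assembled and solved directly. The sketch below assumes the usual form $-M\Delta x+\Delta s=0$, $S\Delta x+X\Delta s=a$ (an assumption on our part); Proposition 2.1 guarantees that the coefficient matrix is nonsingular:

```python
import numpy as np

def solve_newton_system(M, x, s, a):
    """Solve the linear system of Corollary 2.2, written here (as an assumption)
    in the usual form  -M dx + ds = 0,  S dx + X ds = a."""
    n = len(x)
    X, S, I = np.diag(x), np.diag(s), np.eye(n)
    A = np.block([[-M, I],
                  [ S, X]])                      # nonsingular by Proposition 2.1
    sol = np.linalg.solve(A, np.concatenate([np.zeros(n), a]))
    return sol[:n], sol[n:]                      # (dx, ds)
```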
The basic idea of primal-dual IPMs is to replace the second equation in system (1.1), the complementarity condition, by the parameterized equation $xs=\mu e$, with $\mu>0$. The parameter $\mu$ is referred to as the central path parameter. This replacement leads to the following system:
where $\mu>0$. Without loss of generality, we assume that the system (1.1) is strictly feasible, i.e., there exists $(x^0, s^0, \mu^0)>0$ such that $s^0=Mx^0+q$ and $x^0s^0=\mu^0 e$. Moreover, for this given strictly feasible point $(x^0, s^0)$, we can always find a $\mu^0>0$ such that $\Psi(x^0, s^0, \mu^0)\leq \tau$ (see [1]).
Since $M$ is a $P_*(\kappa)$ matrix and (1.1) is strictly feasible, (2.1) has a unique solution for any $\mu>0$. We denote the solution of (2.1) by $(x(\mu), s(\mu))$, the $\mu$-center, for given $\mu>0$. The set of $\mu$-centers $\{(x(\mu), s(\mu)):\mu>0\}$ forms the so-called central path of (1.1). As $\mu\rightarrow 0$, $(x(\mu), s(\mu))$ approaches a solution $(x, s)$ of (1.1). If $(x(\mu), s(\mu))$ is known for some positive $\mu$, we decrease $\mu$ to $\mu:=(1-\theta)\mu$ for some fixed $\theta\in(0, 1)$ and solve the following system:
By Corollary 2.2, the system (2.2) uniquely defines a search direction $(\Delta x, \Delta s)$ for any $(x, s)>0$. We introduce the following notation:
Then we have the scaled Newton-system
where $\bar{M}=DMD$ and $D=\text{diag}(d)$. The second equation in (2.4) is called the scaled centering equation.
We consider a strictly convex function $\Psi(\upsilon)$ which is minimal at $\upsilon=e$ with $\Psi(e)=0$. We then replace the scaled centering equation in (2.4) by
Therefore we have the modified Newton-system as follows:
By Corollary 2.2, this system uniquely defines a search direction $(\Delta x, \Delta s)$.
Throughout the paper we assume that a threshold parameter $\tau$ and a fixed barrier update parameter $\theta$ are given, with $\tau=O(n)$ and $0<\theta<1$. The algorithm works as follows. Starting with a strictly feasible point $(x, s)$ in a $\tau$-neighborhood of the current $\mu$-center, we decrease $\mu$ to $\mu_+:=(1-\theta)\mu$ for some fixed $\theta\in(0, 1)$. Then we solve the modified Newton system (2.6) to obtain the unique search direction. The positivity of the new iterate is ensured by a suitable choice of the step size $\alpha$, which is defined by some line search rule. By repeating this process, we find a new iterate $(x_+, s_+)$ in a $\tau$-neighborhood of the $\mu_+$-center. Then we set $\mu=\mu_+$ and $(x, s)=(x_+, s_+)$. Again, $\mu$ is reduced by the factor $1-\theta$ and we solve the modified Newton system targeting the new $\mu_+$-center, and so on. This process is repeated until $\mu$ is small enough, e.g., until $n\mu\leq \varepsilon$. The proximity function $\Psi(\upsilon)$ is used both to define the search direction and to measure the proximity between the current iterate and the $\mu$-center. The generic form of this algorithm is shown as follows.
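A minimal sketch of this generic scheme (not a verbatim transcription of Algorithm 1) is given below. It assumes the standard scaling $\upsilon=\sqrt{xs/\mu}$ and that, in the original variables, the modified centering equation of (2.6) reads $s\Delta x+x\Delta s=-\mu\upsilon\psi'(\upsilon)$; the names `psi`, `dpsi`, `generic_ipm` and the fixed damping factor `alpha` are ours and stand in for the paper's kernel function and step-size rule.

```python
import numpy as np

def generic_ipm(M, q, x, s, psi, dpsi, theta=0.5, tau=None, eps=1e-6, alpha=0.05):
    """Sketch of the generic large-update IPM for the LCP  s = Mx + q,  xs = 0,
    x, s >= 0, under the assumptions stated in the text above.  psi and dpsi
    are elementwise implementations of the kernel function and its derivative."""
    n = len(x)
    tau = float(n) if tau is None else tau
    assert np.allclose(s, M @ x + q)           # strictly feasible starting point
    mu = float(x @ s) / n                      # initial mu (equal to x^T s / n)
    while n * mu > eps:                        # outer loop: stop when n*mu <= eps
        mu *= (1.0 - theta)                    # mu-update
        v = np.sqrt(x * s / mu)
        while psi(v).sum() > tau:              # inner loop: recenter until Psi(v) <= tau
            # modified Newton system (assumed form): -M dx + ds = 0,
            #                                         s dx + x ds = -mu * v * psi'(v)
            A = np.block([[-M, np.eye(n)], [np.diag(s), np.diag(x)]])
            rhs = np.concatenate([np.zeros(n), -mu * v * dpsi(v)])
            sol = np.linalg.solve(A, rhs)
            dx, ds = sol[:n], sol[n:]
            x, s = x + alpha * dx, s + alpha * ds   # damped step; a fixed alpha stands in
            v = np.sqrt(x * s / mu)                 # for the default step size of Section 4
    return x, s
```

The sketch omits the safeguard that keeps the iterates positive; in the algorithm this is guaranteed by the step-size rule analyzed in Section 4.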
In Algorithm 1, each outer iteration consists of a $\mu$-update and a sequence of inner iterations. The total number of inner iterations is referred to as the iteration complexity of the algorithm. Usually, if $\theta=\Theta(1)$, the algorithm is called a large-update method; if $\theta=\Theta({1\over\sqrt{n}})$, it is called a small-update method.
In this section we consider the univariate function $\psi(t):$
This function is strongly convex and not self-regular.
To simplify the analysis, we restrict ourselves to the case where the proximity function $\Psi(\upsilon)$ is separable with identical coordinate functions, i.e.,
We call the univariate function $\psi(t)$ the kernel function of the proximity function $\Psi(\upsilon)$. For $\psi(t)$, we have
It follows that $\psi(1)=\psi'(1)=0$ and $\psi''(t)\geq 1$, which shows that $\psi(t)$ is indeed a strongly convex kernel function. It is also quite straightforward that
Moreover, since $\psi(1)=\psi'(1)=0$, the kernel $\psi(t)$ is completely determined by its second derivative.
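In particular, the following standard identity, which relies only on these boundary conditions and not on the specific form of the kernel, makes this explicit:
$$\psi(t)=\int_{1}^{t}\int_{1}^{\xi}\psi''(\zeta)\,d\zeta\,d\xi .$$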
In the analysis of the algorithm, we also define the norm-based proximity measure $\delta(\upsilon)$ as follows:
Since $\Psi(\upsilon)$ is strongly convex and minimal at $\upsilon=e$, we have
In what follows, we give some key properties which are important in our analysis.
Lemma 3.1 $\psi(t)$ is exponentially convex, namely,
Proof This result follows easily from Lemma 1 in [3], which states that the above inequality holds if and only if $t\psi''(t)+\psi'(t)\geq 0$ for all $t>0$. Using simple calculus, we obtain
So this completes the proof of the lemma.
Lemma 3.2 Kernel function $\psi(t)$ satisfies the following properties:
(ⅰ) $ {{1}\over {2}}(t-1)^2\leq \psi(t)\leq {{1}\over {2}}\psi'(t)^2$, for all $t>0$.
(ⅱ) $\Psi(\upsilon)\leq 2\delta(\upsilon)^2$.
(ⅲ) $\|\upsilon\|\leq\sqrt{n}+\sqrt{2\Psi(\upsilon)}\leq\sqrt{n}+2\delta(\upsilon)$.
Proof Using (3.4) and $\psi''(t)\geq 1$, for all $t>0$, one can easily obtain that
which proves the first inequality. Then
This completes the proof of (ⅰ). In order to prove (ⅱ), using the second inequality of (ⅰ) and (3.5), we have
which proves (ⅱ). Now, using the first inequality of (ⅰ), we have
which implies that $\|\upsilon\|\leq\sqrt{n}+\sqrt{2\Psi(\upsilon)}\leq\sqrt{n}+2\delta(\upsilon)$.
At the start of any outer iteration of the algorithm, just before the update of $\mu$ with the factor $1-\theta$, we have $\Psi(\upsilon)\leq \tau$. Due to the update of $\mu$ the vector $\upsilon$ is divided by the factor $\sqrt{1-\theta}$, with $0<\theta<1$, which in general leads to an increase in the value of $\Psi(\upsilon)$. Then, during the subsequent inner iterations, $\Psi(\upsilon)$ decreases until it passes the threshold $\tau$ again. Hence, during the course of the algorithm the largest values of $\Psi(\upsilon)$ occur just after the updates of $\mu$. In next two lemmas we investigate the growth behavior of $\Psi(\upsilon)$ after $\mu$-update.
Lemma 3.3 Let $\beta\geq 1$. Then $\psi(\beta t)\leq\psi(t)+{1\over 2}(\beta^2-1)t^2$.
Proof Using the definition of $\psi(t)$, we may write
Defining $h(\beta)=(\beta t-1)e^{{1\over \beta t}-1}$, we have $h(1)=(t-1)e^{{1\over t}-1}$ and
Thus, we have
which implies the lemma.
Lemma 3.4 Let $0<\theta<1$ and $\upsilon_+={\upsilon\over\sqrt{1-\theta}}$. Then
Proof Using Lemma 3.3, with $\beta={1\over\sqrt{1-\theta}}$, we have
Using Lemma 3.2 (ⅲ), we obtain
This completes the proof.
In this section, we evaluate the feasible step size $\alpha$ and obtain an upper bound for the decrease of the proximity function. After a damped step, with step size $\alpha$, we define
Using (2.3), we have
Now, assume that the step size $\alpha$ is such that the coordinates of the vectors $\upsilon+\alpha dx$ and $\upsilon+\alpha ds$ are positive. Using Lemma 3.1, we have
Defining
then $f(\alpha)\leq f_1(\alpha)$. Note that $f(0)=f_1(0)=0$.
By taking the first derivative of $f_1(\alpha)$ with respect to $\alpha$, we obtain
From (2.5) and (3.5), we have
Taking the derivative of $f_1'(\alpha)$ with respect to $\alpha$, we obtain
Since $M$ is a $P_*(\kappa)$ matrix and $M\Delta x=\Delta s$ from (2.6), for $\Delta x, \Delta s\in R^n$, we have
where $J_+(x)=\{i\in J:\Delta x_i\Delta s_i\geq 0\}$ and $J_-(x)=J-J_+(x)$. Due to
For notational convenience, we define the following:
To compute the bound of $\|dx\|$ and $\|ds\|$, we need the following technical lemma.
Lemma 4.1 One has $\sigma_+\leq\delta^2$ and $\sigma_-\leq(1+4\kappa)\delta^2$.
Proof From the definition of $\sigma_+$, $\sigma_-$ and $\delta$, we have
From (4.4), we also have
To compute an upper bound for the decrease of the proximity function during an iteration, we need the following technical lemmas.
Lemma 4.2(Lemma 4.4 in[9]) $\|dx\|\leq 2\sqrt{1+2\kappa}\delta$ and $\|ds\|\leq 2\sqrt{1+2\kappa}\delta$.
Lemma 4.3(Lemma 4.5 in[9]) $f_1''(\alpha)\leq2(1+2\kappa)\delta^2\psi''(\upsilon_\text{min}-2\alpha\sqrt{1+2\kappa}\delta)$.
Lemma 4.4(Lemma 4.6 in[9]) $f_1'(\alpha)\leq 0$ if $\alpha$ satisfies the following inequality
Note that $\psi''(t)\geq 1$ for $t>0$, so the function $-{1\over 2}\psi'(t)$ is invertible. Let $\rho:[0, \infty)\rightarrow(0, 1]$ denote the inverse function of the restriction of $-{1\over 2}\psi'(t)$ to the interval $(0, 1]$. We want to make the step size $\alpha$ as large as possible. The derivative of the left-hand side of (4.5) with respect to $\alpha$ is $2\sqrt{1+2\kappa}\delta\psi''(\upsilon_\text{min}-2\alpha\sqrt{1+2\kappa}\delta)>0$, since $\psi''(t)>0$. Hence the left-hand side of (4.5) is monotonically increasing in $\alpha$, and the largest possible value of $\alpha$ satisfying (4.5) occurs when
The derivative of the left-hand side of (4.6) with respect to $\upsilon_\text{min}$ is
Since $\psi'''(t)<0$, the left-hand side of (4.6) is monotonically decreasing in $\upsilon_\text{min}$. This implies that, for fixed $\delta$, $\alpha$ gets smaller as $\upsilon_\text{min}$ gets smaller.
By the definition of $\delta$ and $\Psi(\upsilon)$, we have
and the equality holds if and only if $\upsilon_\text{min}$ is the only coordinate in $\upsilon$ which is different from 1 and $\upsilon_\text{min}\leq 1$, i.e. $\psi'(\upsilon_\text{min})\leq 0$. Hence, the worst situation for the step size occurs when $\upsilon_\text{min}$ satisfies
In this case by (4.7) and the definition of $\rho$, we have $\upsilon_\text{min}=\rho(\delta)$.
By (4.6) and (4.7),
Then by (4.8) and the definition of $\rho$, we have
So by $\upsilon_\text{min}=\rho(\delta)$, the largest step size $\alpha$ is given as follows:
Now we want to compute the lower bound for $\bar{\alpha}$.
Lemma 4.5 Let $\rho$ and $\bar{\alpha}$ be as defined, then we have
Proof By the definition of $\rho$, $-\psi'(\rho(\delta))=2\delta$. Taking the derivative with respect to $\delta$, we obtain
Thus, $\rho$ is monotonically decreasing. Substituting this equality into (4.9), we have
Since $\delta\leq\xi\leq\delta+{\delta\over\sqrt{1+2\kappa}}$ and $\rho$ is monotonically decreasing, $\rho(\xi)\geq\rho(\delta+{\delta\over\sqrt{1+2\kappa}})$. Since $\psi''(t)$ is monotonically decreasing, $\psi''(\rho(\xi))\leq\psi''(\rho(\delta+{\delta\over\sqrt{1+2\kappa}}))$. Therefore, we have the following lower bound for $\bar{\alpha}$:
and we use $\tilde{\alpha}$ as the default step size in our algorithm. By Lemma 4.5, $\tilde{\alpha}\leq\bar{\alpha}$.
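In general the inverse function $\rho$ has no closed form, but it can be evaluated numerically, which is all one needs in practice to compute the default step size $\tilde{\alpha}$. The sketch below does so by bisection, exploiting that $-{1\over 2}\psi'(t)$ is decreasing on $(0, 1]$ and vanishes at $t=1$; here `dpsi` is a placeholder for the derivative of a kernel function, and the classical logarithmic kernel is used only as a stand-in for the kernel of this paper.

```python
def rho(z, dpsi, lo=1e-12, hi=1.0, tol=1e-12):
    """Evaluate rho(z), the inverse of t |-> -psi'(t)/2 restricted to (0, 1],
    by bisection; dpsi is the derivative psi' of the kernel function."""
    # We seek t in (0, 1] with -dpsi(t)/2 = z; this map is decreasing in t.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if -0.5 * dpsi(mid) > z:
            lo = mid          # value at mid too large -> root lies to the right
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Example with the classical logarithmic kernel psi(t) = (t^2 - 1)/2 - log t,
# used here only as a stand-in: psi'(t) = t - 1/t.
print(rho(1.0, dpsi=lambda t: t - 1.0 / t))   # approx 0.414 (solves 1/t - t = 2)
```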
In the following, we want to estimate the value of $f(\tilde{\alpha})$. For this purpose, we cite the following technical lemma in [3] without proof.
Lemma 4.6 Let $h(t)$ be a twice differentiable convex function with $h(0)=0$, $h'(0)<0$, and suppose that $h(t)$ attains its (global) minimum at $t^{*}>0$. If $h''(t)$ is increasing for $t\in[0, t^{*}]$, then
Now, we can present an upper bound for $f(\tilde{\alpha})$.
Lemma 4.7(Lemma 5.2 in [7]) If the step size $\alpha$ is such that $\tilde{\alpha}\leq\bar{\alpha}$, then we have
Lemma 4.8 Let $\rho$ be the inverse function of the restriction of $-{1\over 2}\psi'(t)$ to the interval $(0, 1]$ and $\tilde{\alpha}$ be defined as in (4.11). Then
Proof The result follows immediately if we apply Lemma 4.7 to the default step size $\tilde{\alpha}$.
Now we apply the results obtained so far to our proximity function. To this end, we need to compute $s=\rho(\delta+{\delta\over\sqrt{1+2\kappa}})$. Note that, by the definition of $\rho$, $-\psi'(s)=2(\delta+{\delta\over\sqrt{1+2\kappa}})$. So from (3.3) and (4.11),
Thus,
Since $0\leq s\leq 1$ implies ${{s^2}\over{s^2-s+1}}\leq 1$, we have
Therefore, we conclude that
Now, using Lemma 4.7, we know
Since the right side of (4.15) is monotonically decreasing in $\delta$, by using Lemma 3.2 (ⅱ), we can express the decrease in terms of $\Psi=\Psi(\upsilon)$ as follows:
where the last equality is obtained from the definition of $\Theta$ and the fact that during an inner iteration in Algorithm 1, $\Psi_0\geq\Psi(\upsilon)\geq\tau\geq 1$.
After the update of $\mu$ to $(1-\theta)\mu$, by Lemma 3.4 we have
At the start of an outer iteration, we have $\Psi(\upsilon)\leq \tau$. We assume that $\tau=O(n)$ and, for the large-update method, $\theta=\Theta(1)$. We need to count how many inner iterations are required to return to the situation $\Psi(\upsilon)\leq \tau$ after the $\mu$-update. We denote the value of $\Psi(\upsilon)$ just after the $\mu$-update by $\Psi_0$, and the subsequent values in the same outer iteration by $\Psi_i$, $i=1, 2, \cdot\cdot\cdot, K$, with $K$ denoting the total number of inner iterations in the outer iteration. Then we have
We need the following technical result (Lemma 14 in [3]) to get the iteration bound.
Lemma 5.1 Let $t_0, t_1, \cdot\cdot\cdot, t_K$ be a sequence of positive numbers such that
where $\beta>0$ and $0<\gamma\leq 1$. Then $K\leq \big\lceil t_0^\gamma /(\beta\gamma)\big\rceil$.
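As a quick sanity check of this bound, one can iterate the worst case of the recursion (assuming it takes the standard form $t_{k+1}\leq t_k-\beta t_k^{1-\gamma}$, $k=0, 1, \cdot\cdot\cdot, K-1$, as in Lemma 14 of [3]) and compare the observed count with $\lceil t_0^\gamma /(\beta\gamma)\rceil$:

```python
import math

def worst_case_length(t0, beta, gamma):
    """Iterate t_{k+1} = t_k - beta * t_k**(1 - gamma), the worst case allowed by
    Lemma 5.1 (assuming the standard form of the recursion from [3]), and count
    how many steps keep the sequence positive."""
    k, t = 0, t0
    while t - beta * t ** (1.0 - gamma) > 0:
        t -= beta * t ** (1.0 - gamma)
        k += 1
    return k

t0, beta, gamma = 100.0, 0.1, 0.5
print(worst_case_length(t0, beta, gamma))              # observed number of steps
print(math.ceil(t0 ** gamma / (beta * gamma)))         # bound from Lemma 5.1: 200
```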
Lemma 5.2 Let $K$ denote the total number of inner iterations in an outer iteration. Then we have
Proof From (4.16), we have
Letting
and using Lemma 5.1, we obtain
Theorem 5.1 Let $\tau=O(n)$ and $\theta=\Theta(1)$, which are characteristic of large-update methods. Then Algorithm 1 requires at most $O\big((1+2\kappa) \sqrt{n}(\log n)^{2}\log\frac{n}{\varepsilon}\big)$ iterations.
Proof It is well known [9] that the number of outer iterations is bounded by $\frac{1}{\theta}\log\frac{n}{\varepsilon}$. Multiplying this number by the upper bound on the number of inner iterations per outer iteration, and using $\tau=O(n)$ and $\theta=\Theta(1)$, we obtain the iteration complexity
In this paper, we introduced a new kernel function, which has a simple algebraic expression and, to our knowledge, has not been used before. Based on the new kernel function, we proposed a polynomial interior-point algorithm for $P_*(\kappa)$ LCPs and showed that this large-update method has the polynomial complexity $O\big((1+2\kappa) \sqrt{n}(\log n)^{2}\log\frac{n}{\varepsilon}\big)$, which improves on the bounds of large-update primal-dual IPMs based on some other kernel functions. Our analysis of the iteration complexity based on the new kernel function also differs from the analysis of other primal-dual IPMs based on logarithmic barrier functions and recently introduced kernel functions.