数学杂志  2016, Vol. 36 Issue (4): 794-808   PDF    
扩展功能
加入收藏夹
复制引文信息
加入引用管理器
Email Alert
RSS
本文作者相关文章
周光辉
张从军
张成虎
王月虎
非线性互补问题的两种数值解法
周光辉1, 张从军2, 张成虎2, 王月虎3     
1. 淮北师范大学数学科学学院; 信息学院, 安徽 淮北 235000;
2. 南京财经大学应用数学学院, 江苏 南京 210023;
3. 南京财经大学管理科学与工程学院, 江苏 南京 210023
摘要:本文研究了非线性互补问题的两类数值求解方法.在经典LQP算法及Levenberg-Marquardt算法的基础上, 构造了两种新算法, 并证明了这两种新算法的收敛性.数值实验表明, 新算法对测试问题优于已有算法.
关键词非线性互补问题    LQP算法    Levenberg-Marquardt算法    
TWO CLASSES OF NUMERICAL METHODS FOR THE NONLINEAR COMPLEMENTARITY PROBLEM
ZHOU Guang-hui1, ZHANG Cong-jun2, ZHANG Cheng-hu2, WANG Yue-hu3     
1. School of Mathematical Sciences; Information College, Huaibei Normal University, Huaibei 235000, China;
2. School of Applied Mathematics, Nanjing University of Finance and Economics, Nanjing 210023, China;
3. School of Management Science and Engineering, Nanjing University of Finance and Economics, Nanjing 210023, China
Abstract: In this paper, two numerical methods are proposed for the nonlinear complemen-tarity problems. Based on the classical LQP algorithm and the Levenberg-Marquardt algorithm, the new algorithms are designed; the convergence properties of our methods are obtained; the numerical experiments show that our new methods are better than traditional ones for some problems.
Key words: nonlinear complementarity problem     LQP algorithm     Levenberg-Marquardt algorithm    
1 引言

考虑非线性互补问题${\rm NCP}\left( F \right)$:寻找一个向量$x\in R^n$, 使

$ x\ge 0, \quad F\left( x \right)\ge 0, \quad x^TF\left( x \right)=0, $ (1.1)

其中$F:R^n\to R^n$是一个非线性映射.

一个${\rm NCP}\left(F\right)$可转化为寻找一个极大单调算子$T$的零点问题.由Martinet[1]提出的邻近点算法, 将极大单调算子的零点问题转化为一个极大包含问题.将极大单调包含问题中的线性项用非线性项代替, 使上述问题转化为非线性方程组的问题, 此类算法称为LQP算法.

He[2]、Bnouhachem [3]、Noor和Bnouhachem [4]、Xu [5]引进了基于"预测校正方法"的LQP算法, 能很好的求解上述非线性方程组问题.

预测校正方法, 在每次迭代过程中包括两个步骤:预测步与校正步.预测子(predictor)是通过在一个误差准则下近似求解LQP方程组获得的; 校正步, 亦称迭代步(new iterate), 一般是由一个投影算子表示的.

例如Noor和Bnouhachem在文献[4]中, 提出了如下算法:

步骤1  寻找近似解$\tilde {x}^k$, 使得

$ 0\approx \beta _k F(\tilde {x}^k)+\tilde {x}^k-x^k+\mu X_k \log \frac{\tilde {x}^k}{x^k}=:\xi ^k, $

其中$\left\| {\xi ^k} \right\|\le \eta \left\| {x^k-\tilde {x}^k} \right\|$, $\eta \in \left( {0, 1} \right).$

步骤2  对$\alpha >0$, $x^{k+1}\left( \alpha \right)$是下面方程组的正数解:

$ \left( {\frac{1-\mu }{1+\mu }} \right)\alpha \beta _k F(\tilde {x}^k)+x-x^k+\mu X_k \log \frac{x}{x^k}=0. $

Yuan在文献[6]中提出了以下算法:

步骤1  寻找$\tilde {x}^k\in R_+^n $, 满足$0\approx \beta_k F(\tilde {x}^k)+\tilde {x}^k-\left( {1-\mu } \right)x^k-\mu X_k^2 \left( {\tilde {x}^k} \right)^{-1}=:\xi ^k, $其中$\xi^k$满足如下误差准则:

$ \left\| {\xi ^k} \right\|\le \eta \sqrt {1-\mu ^2} \left\| {x^k-\tilde {x}^k} \right\|. $

步骤2  令$x^{k+1}\left( {\alpha _k } \right)=P_{R_+^n } \left[{x^k-\alpha _k \frac{\beta _k }{1+\mu }F\left( {\tilde {x}^k} \right)} \right]$.

受文献[4, 6]所提算法的启发, 本文提出了一种基于预测校正方法的LQP算法.在新算法中设置了两个预测步, 并使用$x^k$与投影算子的凸组合构成算法的校正步.

此外, 可借助非线性互补函数, 将${\rm NCP}\left( F \right)$转化为非线性方程组问题:

$ \Phi \left( x \right)=0, $ (1.2)

其中$\Phi :R^n\to R^m$是一个非线性映射.

Levenberg-Marquardt算法可以看作是一种非线性最小二乘法, 是用于求解上述非线性方程组问题的一种改进的Gauss-Newton算法.由于此算法在优化问题中的广泛应用, 对于该算法的研究成果颇丰, 例如可参见文献[7-11].

Levenberg-Marquardt算法在每个迭代点处的搜索方向$d_L^k$通过如下线性方程组获得

$ \left( {\Phi'\left( {x^k} \right)^T\Phi'\left( {x^k} \right)+\mu _k I}\right)d=-\Phi'\left( {x^k} \right)^T\Phi \left( {x^k} \right), $ (1.3)

其中$\Phi'\left( x \right)$表示$\Phi \left( x \right)$的Jacobian, $\mu_k >0$是一个参数.参数$\mu _k$的引入克服了Gauss-Newton算法中要求矩阵$\Phi'\left( {x^\ast } \right)$满秩的困难. Yamashita和Fukushima [12]、Dan [13]等学者提出了该参数的更新准则: $\mu _k =\left\| {\Phi \left( {x^k} \right)} \right\|^\delta $, 其中$\delta \in \left( {0, \left. 2 \right]} \right.$.本文将沿用这一更新准则.

值得注意的是, 由于$\Phi'\left( {x^k} \right)^T\Phi'\left({x^k}\right)$的半正定性, 正参数$\mu _k$的引入使得搜索方向(试探步)$d_L^k$远离了矩阵广义逆下的迭代步$d_{MP}^k =-\Phi'\left( {x^k} \right)^+\Phi\left( {x^k} \right)$.所以本文将考虑将下式的解$d_G^k$作为修正的搜索方向:

$ \left( {\Phi'\left( {x^k} \right)^T\Phi'\left( {x^k} \right)+\mu _k I} \right)d_G^k =-\Phi'\left( {x^k} \right)^T\Phi \left( {x^k} \right)+\mu _k d_L^k . $ (1.4)

这样$d_G^k $很可能比$d_L^k $更靠近$d_{MP}^k.$

另外, 为了保证此类非线性方程组算法的全局收敛性, 通常需要假设$F$$P_0$函数, 并且在证明其超线性收敛性时, 通常需要一些非奇异性和严格互补条件的假设, 参见文献[2, 12].

考虑到以上两点, 本文给出了一种改进的Levenberg-Marquardt算法.该算法的优点在于搜索方向经过修正能更接近Moore-Penrose步, 并且其全局收敛性的证明不需要$F$$P_0$函数的假设.

本文接下来的部分, 分别给出一种新的LQP算法和Levenberg-Marquardt算法, 阐述了算法中参数的选取方法, 给出了收敛性定理, 通过数值实验说明了新算法相对已有算法的优劣, 表明了算法的有效性.

2 非线性互补问题的LQP算法
2.1 LQP算法的构造

算法2.1

步骤0  设置参数$\beta _0 >0$, $\varepsilon >0$, $\mu \in \left( {0, 1}\right)$, $\delta \in \left( {1, 2} \right)$, $\sigma \in \left( {1, 2}\right)$, $\eta \in \left( {0, 1} \right)$, $\tau \in \left( {0, \left. 1\right]} \right.$.初始点为$x^0>0$.

步骤1  如果$\left\| {\min\left\{ {x^k, F\left( {x^k} \right)} \right\}}\right\|_\infty \le \varepsilon $, 算法终止.否则, 转入步骤2.

步骤2  (预测步1) 寻找$\tilde {x}^k\in R_+^n $, 满足

$ 0\approx \beta _k F(\tilde {x}^k)+\tilde {x}^k-x^k+\mu X_k \log \frac{\tilde{x}^k}{x^k}=:\xi ^k, $ (2.1)

其中$\xi ^k$满足如下非精确误差准则:

$ \left\| {\xi ^k} \right\|\le \eta \sqrt {1-\mu ^2} \left\| {x^k-\tilde {x}^k} \right\|. $ (2.2)

计算$\xi ^k:=\beta _k \left( {F\left( {\tilde {x}^k} \right)-F\left( {x^k} \right)} \right)$, $r=\frac{\left\| {\xi ^k} \right\|}{\sqrt {1-\mu ^2} \left\| {x^k-\tilde {x}^k} \right\|}$.如果$r>\eta $, 令$\beta _k =\beta _k \ast \frac{0.8}{r}, $重复步骤2;否则, 转入步骤3.

步骤3  (预测步2) 计算最优参数$\alpha _k$.然后, 寻找如下方程组的正数解$\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k$:

$ \left( {\frac{1-\mu }{1+\mu }} \right)\alpha \beta _k F(\tilde {x}^k)+x-x^k+\mu X_k \log \frac{\tilde {x}^k}{x^k}=0. $ (2.3)

步骤4  (校正步)计算最优参数$\gamma _k $.然后, 生成新的迭代点$x^{k+1}$:

$ x^{k+1}(\gamma )=\tau x^k+(1-\tau )P_{R_+^n } \left[{x^k-\gamma (x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k)} \right]. $ (2.4)

步骤5  如果$r\le 0.5$, 令$\beta _{k+1} =\frac{\beta _k \ast 0.7}{r}$; 否则, 令$\beta _{k+1} =\beta _k $.

步骤6  令$k=k+1$, 转入步骤1.

注2.1  算法2.1中包括了两个预测步和一个校正步.校正步由一个凸组合表示, 见式(2.4) 所示.此外, 算法中对于参数$\alpha$$\gamma $的确定是本节研究的重点.

2.2 参数的确定及收敛性分析

$\Pi (\alpha )=\left\| {x^k-x^\ast } \right\|^2-\left\| {\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k(\alpha )-x^\ast } \right\|^2$, $\Psi (\gamma )=\left\|{x^k-x^\ast } \right\|^2-\left\| {x^{k+1}(\gamma )-x^\ast }\right\|^2.$

下面说明如何确定合适的参数$\alpha $$\gamma$, 同时给出算法2.1的收敛性定理.

首先给出两个常用结论.

引理2.1[14]  假设$P_{R_+^n}(.)$表示欧式范数下的投影, 即$P_{R_+^n } (y)=\min \left\{{\left\| {y-x}\right\|, x\in R_+^n } \right\}$, 那么对任意的$u\in R_+^n $$v\in R_+^n$, 有下面两式成立

$ \;\;\left\langle {v-P_{R_+^n } (v), P_{R_+^n } -u} \right\rangle \ge 0, \\\;\; \left\| {P_{R_+^n } (v)-u} \right\|^2\le \left\| {v-u} \right\|^2-\left\| {v-P_{R_+^n } (v)} \right\|^2. $

引理2.2[4]  对给定的$x^k>0$, $q\in R^n$, 假定$x$是如下方程组的一个正数解

$ q+x-x^k+\mu X_k \log \frac{x}{x^k}=0, $

其中$X_K =\mbox{diag}(x_1^k, \cdots, x_n^k )$, $\log \frac{x}{x^k}=(\log \frac{x_1}{x_1^k }, \cdots, \log \frac{x_n}{x_n^k })^T$, 那么对任意的$y\ge 0$, 有下式成立

$\left\langle {x-y, -q} \right\rangle \ge \frac{1+\mu }{2}\left(\left\| {x-y}\right\|^2-\left\| {x^k-y} \right\|^2\right)+\frac{1-\mu }{2}\left\| {x^k-x}\right\|^2. $

下面的定理用来说明如何确定参数$\alpha $.

定理2.1[4]  假设对任意的$x^\ast \in \Omega^\ast $, $\alpha>0$, $\tilde{x}^k$$\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k$是由算法2.1生成的, 则有下式成立:

$ \Pi (\alpha )\ge \frac{1-\mu }{1+\mu }\Phi (\alpha ), $ (2.5)

其中

$ \Phi (\alpha )=2\alpha \phi _k -\alpha ^2\left\| {d_k } \right\|^2, $ (2.6)
$ d_k =(x^k-\tilde {x}^k)+\frac{1}{1+\mu }\xi ^k, $ (2.7)
$ \phi _k =\frac{1}{1+\mu }\left\| {x^k-\tilde {x}^k} \right\|^2+\frac{1}{1+\mu }\left\langle {x^k-\tilde {x}^k, \xi ^k} \right\rangle. $ (2.8)

注2.2  通过选取合适的$\alpha $极大化$\Phi (\alpha)$, 使每次迭代产生的迭代量$\Pi (\alpha )$极大化.由式(2.6) 可知$\Phi(\alpha )$是关于$\alpha $的二次函数, 则极大值点位于

$ \alpha _k^\ast =\frac{\phi _k }{\left\| {d_k } \right\|^2}, $ (2.9)

此时

$ \Phi (\alpha )=\alpha ^\ast \phi _k . $ (2.10)

引理2.3  假设$x^\ast \in \Omega ^\ast $, $\tilde{x}^k$$\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k$是由算法2.1生成的, 那么就有下式成立:

$ \Phi (\alpha _k )\ge \frac{1+\mu ^2-\eta ^2}{4(1+\mu )}\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2. $ (2.11)

  一方面, 由式(2.2)、式(2.8)、柯西-施瓦茨不等式及$\mu \in(0, 1)$

$ \phi _k =\frac{1}{1+\mu }\left\| {x^k-\tilde {x}^k}\right\|^2+\frac{1}{1+\mu }\left\langle {x^k-\tilde {x}^k, \xi ^k} \right\rangle \notag \\\;\;\;\; \ge \frac{1}{1+\mu }\left\| {x^k-\tilde {x}^k} \right\|^2-\frac{1}{1+\mu }\left\| {x^k-\tilde {x}^k} \right\|\left\| {\xi ^k} \right\| \notag \\\;\;\;\; \ge \frac{1}{1+\mu }\left( {\left\| {x^k-\tilde {x}^k} \right\|^2-\frac{1-\mu ^2}{2}\left\| {x^k-\tilde {x}^k} \right\|^2-\frac{1}{2(1-\mu ^2)}\left\| {\xi ^k} \right\|^2} \right) \notag \\\;\;\;\; \ge \frac{1}{1+\mu }\frac{1+\mu ^2-\eta ^2}{2}\left\| {x^k-\tilde {x}^k} \right\|^2 \notag \\\;\;\;\; =\frac{1+\mu ^2-\eta ^2}{2(1+\mu )}\left\| {x^k-\tilde {x}^k} \right\|^2 . $ (2.12)

另一方面, 由式(2.2)、式(2.8) 有

$ \phi _k =\frac{1}{1+\mu }\left\| {x^k-\tilde {x}^k} \right\|^2+\frac{1}{1+\mu }\left\langle {x^k-\tilde {x}^k, \xi ^k} \right\rangle \\\;\;\;\; =\frac{1}{2}\left[{\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+\frac{1-\mu }{1+\mu }\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+\frac{2}{1+\mu }\left\langle {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k, \xi ^k} \right\rangle } \right] \\\;\;\;\; >\frac{1}{2}\left[{\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+\frac{1}{(1+\mu )^2}\left\| {\xi ^k} \right\|^2+\frac{2}{1+\mu }\left\langle {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k, \xi ^k} \right\rangle } \right]. $

由于$\left\| {a+b} \right\|^2=\left\| a \right\|^2+\left\| b \right\|^2+2\left\langle {a, b} \right\rangle $$d_k=(x^k-\tilde{x}^k)+\frac{1}{1+\mu }\xi ^k$, 则有$\phi _k \ge \frac{1}{2}\left\| {d{ }_k}\right\|^2$.再由式(2.9), 有

$ \alpha _k^* >\frac{1}{2}. $ (2.13)

由式(2.10)、式(2.12) 及式(2.13) 知, 式(2.14) 成立.证毕.

注2.3  为了加快收敛速度, 用松弛因子$\delta \in (1, 2)$乘以$\alpha_k^\ast $, 得到$\alpha _k $.即

$ \alpha _k =\delta \alpha _k^\ast =\delta \frac{\left\| {x^k-\tilde {x}^k}\right\|^2+\left\langle {x^k-\tilde {x}^k, \xi ^k} \right\rangle }{\left({1+\mu } \right)\left\| {(x^k-\tilde {x}^k)+\frac{1}{1+\mu }\xi ^k}\right\|^2}. $

此时

$ \Phi \left( {\alpha _k } \right) =\Phi \left( {\delta \alpha _k^\ast }\right) =\left( {2\delta -\delta ^2} \right)\Phi \left( {\alpha _k^\ast } \right) \ge \frac{\delta \left( {2-\delta } \right)\left( {1+\mu ^2-\eta ^2}\right)}{4(1+\mu )}\left\| {x^k-\tilde {x}^k} \right\|^2. $ (2.14)

由定理2.1中式(2.5), 立有下式成立

$ \Pi \left( {\alpha _k } \right) =\Pi \left( {\delta \alpha _k^\ast } \right) =\left\| {x^k-x^\ast } \right\|^2-\left\|{\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}}^k\left( {\delta \alpha _k^\ast } \right)-x^\ast } \right\|^2\\\;\;\;\;\;\;\;\;\;\;\; \ge \frac{1-\mu }{1+\mu }\Phi \left( {\delta \alpha _k^\ast } \right) \ge \frac{\delta \left( {2-\delta } \right)\left( {1-\mu } \right)\left({1+\mu ^2-\eta ^2} \right)}{4(1+\mu )^2}\left\| {x^k-\tilde {x}^k}\right\|^2 . $

采用类似于定理2.1的思想, 给出定理2.2, 分析参数$\gamma$的选取方法.为描述简单, 令$x_\ast ^k \left(\gamma \right):=P_{R_+^n }\left[{x^k-\gamma (x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k)} \right].$

定理2.2  假设$x^\ast \in \Omega ^\ast $, 且$\tilde{x}^k$$\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k$是由算法2.1生成的, 则有$\Psi \left( \gamma \right)\ge (1-\tau )\Theta (\gamma ), $其中

$ \Theta (\gamma )=\gamma \left[{\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+\left\| {x^k-x^\ast } \right\|^2-\left\| {\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k-x^\ast } \right\|^2} \right]-\gamma ^2\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2. $ (2.15)

  由$\left\| {a+b} \right\|^2=\left\| a \right\|^2+\left\| b \right\|^2+2\left\langle {a, b} \right\rangle$$\left\langle {a+b, b}\right\rangle =\left\langle {a, b}\right\rangle +\left\| b \right\|^2$, 有下面两式成立

$ 2\left\langle {x^k-x_\ast ^k \left( \gamma \right), x_\ast ^k \left( \gamma \right)-x^\ast } \right\rangle =\left\| {x^k-x^\ast } \right\|^2-\left\| {x^k-x_\ast ^k \left( \gamma \right)} \right\|^2+\left\| {x_\ast ^k \left( \gamma \right)-x^\ast } \right\|^2, \\ 2\left\langle {x^\ast -\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k, x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\rangle =\left\| {x^\ast -\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2-\left\| {x^\ast -x^k} \right\|^2+\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2 . $

由引理2.1, 有

$ \left\| {x_\ast ^k \left( \gamma \right)-x^\ast } \right\|^2\le \left\| {x^k-\gamma \left( {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right)-x^\ast } \right\|^2-\left\| {x^k-\gamma \left( {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right)-x_\ast ^k \left( \gamma \right)} \right\|^2. $

注意到$2\left\langle {a+b, b} \right\rangle =\left\| {a+b} \right\|^2-\left\|a \right\|^2+\left\| b \right\|^2$$\tau \in \left( {0, 1} \right)$, 有

$ \left\| {x^{k+1}\left( \gamma \right)-x^\ast } \right\|^2 =\left\| {\tau\left( {x^k-x^\ast } \right)+\tau \left( {1-\tau } \right)\left( {x_\ast ^k\left( \gamma \right)-x^\ast } \right)} \right\|^2 \\\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; =\tau \left\| {x^k-x^\ast } \right\|^2+\left( {1-\tau } \right)\left\|{\left( {x_\ast ^k \left( \gamma \right)-x^\ast } \right)} \right\|^2-\tau\left( {1-\tau } \right)\left\| {x^k-x_\ast ^k \left( \gamma \right)}\right\|^2 \\\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; \le \tau \left\| {x^k-x^\ast } \right\|^2+\left( {1-\tau } \right)\left[{\left\| {x^k-\gamma \left({x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right)-x^\ast } \right\|^2}\right. \\\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; \left. {-\left\| {x^k-\gamma \left({x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right)-x_\ast ^k \left( \gamma \right)} \right\|^2-\tau \left\| {x^k-x_\ast ^k \left( \gamma \right)} \right\|^2} \right], $

那么

$ \left\| {x^{k+1}\left( \gamma \right)-x^\ast } \right\|^2 =\left\| {x^k-x^\ast } \right\|^2-\left( {1-\tau } \right)\left[{\left\|{x^k-x_\ast ^k \left( \gamma \right)-\gamma \left({x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right)} \right\|^2} \right. \notag \\\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; \qquad \left. {+\tau \left\| {x^k-x_\ast ^k \left(\gamma \right)} \right\|^2-\gamma ^2\left\|{x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+2\gamma \left\langle {x^k-x^\ast, x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\rangle } \right] \notag \\\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; \le \left\| {x^k-x^\ast } \right\|^2-\left( {1-\tau } \right)\left[{2\gamma\left\langle {x^k-x^\ast , x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\rangle-\gamma ^2\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2} \right]. $ (2.16)

$\Psi \left( \gamma \right)$的定义、式(2.16) 及$\left\| {a-c}\right\|^2=\left\langle {a-b, a-c} \right\rangle +\left\langle {b-c, a-c}\right\rangle $, 有

$ \Psi (\gamma ) =\left\| {x^k-x^\ast } \right\|^2-\left\| {x^{k+1}(\gamma)-x^\ast } \right\|^2\\\;\;\;\;\;\;\;\; \ge \left( {1-\tau } \right)\left[{2\gamma \left\langle {x^k-x^\ast, x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\rangle-\gamma ^2\left\|{x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2} \right] \\\;\;\;\;\;\;\;\; =\left( {1-\tau } \right)\left[{2\gamma \left( {\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2-\left\langle {x^\ast-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}}^k, x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\rangle } \right)-\gamma ^2\left\|{x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2} \right] \\\;\;\;\;\;\;\;\; \ge \left( {1-\tau } \right)\left[\gamma \left( {2\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2-\left\| {x^\ast-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}}^k} \right\|^2+\left\| {x^\ast-x^k} \right\|^2-\left\|{x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2} \right)-\gamma ^2\left\|{x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2 \right] \\\;\;\;\;\;\;\;\; = (1-\tau )\Theta (\gamma ). $

证毕.

注2.4  由于$\Psi \left(\gamma\right)$衡量了算法在第$k+1$步相对于第$k$步的改进, 通过选取合适的$\gamma$极大化$\Theta (\gamma )$, 使每次迭代产生的迭代量$\Psi \left( \gamma\right)$极大化.由式(2.15) 可知$\Theta (\gamma )$是关于$\gamma$的二次函数, 由$\Phi (\alpha )$的定义式及定理2.1, 则极大值点位于

$ \gamma _k^\ast =\frac{\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+\Phi (\alpha _k )}{2\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2}. $

注2.5  为了加快收敛速度, 用松弛因子$\sigma \in \left({0, 1}\right)$乘以$\gamma _k^\ast $, 得到$\gamma _k $, 即

$ \gamma _k =\sigma \gamma _k^\ast =\sigma \frac{\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+\Phi (\alpha _k )}{2\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2}. $ (2.17)

下面的定理2.3是算法2.1收敛性分析过程中一个重要的结论.

定理2.3  假设$x^\ast \in \Omega ^\ast $, $\sigma \in \left( {0, 1}\right)$, $\tilde{x}^k$, $\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k$$x^{k+1}\left( {\gamma _k }\right)$是由算法2.1生成的, 那么就有下式成立:

$ \Psi \left( {\gamma _k } \right)\ge \frac{\sigma \left( {2-\sigma }\right)\left( {1-\tau } \right)\delta ^2\left( {2-\delta } \right)^2\left({1+\mu ^2-\eta ^2} \right)^2}{64\left( {1+\mu } \right)^2}\left\|{x^k-\tilde {x}^k} \right\|^2. $

  由式(2.13), (2.14), (2.17) 及定理2.2, 有

$ \Psi \left( \gamma \right) \ge (1-\tau )\Theta (\gamma )\\\;\;\;\;\;\;\;\; =\left( {1-\tau } \right)\left\{ {\gamma _k \left[{\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+\left\| {x^k-x^\ast } \right\|^2-\left\| {\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k-x^\ast } \right\|^2} \right]-\gamma _k ^2\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2} \right\} \\\;\;\;\;\;\;\;\; \ge \left( {1-\tau } \right)\left\{ {\gamma _k \left[{\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+\Phi \left( {\alpha _k } \right)} \right]-\gamma _k ^2\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2} \right\} \\\;\;\;\;\;\;\;\; =\sigma \left( {1-\tau } \right)\left\{ {\gamma _k^\ast \left[ {\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+\Phi \left( {\alpha _k } \right)} \right]-\sigma \left( {\gamma _k^\ast } \right)^2\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2} \right\} \\\;\;\;\;\;\;\;\; =\sigma \left( {2-\sigma } \right)\left( {1-\tau } \right)\gamma _k^\ast \frac{\left\| {x^k-\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {x}} ^k} \right\|^2+\Phi \left( {\alpha _k } \right)}{2} \\\;\;\;\;\;\;\;\; \ge \frac{\sigma \left( {2-\sigma } \right)\left( {1-\tau } \right)\delta ^2\left( {2-\delta } \right)^2\left( {1+\mu ^2-\eta ^2} \right)^2}{64\left( {1+\mu } \right)^2}\left\| {x^k-\tilde {x}^k} \right\|^2. $

证毕.

定理2.3表明, 算法2.1生成的序列$\left\{{x^k}\right\}$关于式(2.1) 的解点是Féjer单调的.故有如下推论成立.

推论2.1  令$x^\ast \in \Omega ^\ast $, 且$\left\{{\tilde {x}^k} \right\}$$\left\{ {x^k}\right\}$是由算法2.1生成, 那么

(1) 序列$\left\{ {x^k} \right\}$有界;

(2) 序列$\left\{ {\left\| {x^k-x^\ast } \right\|} \right\}$是非增的;

(3) $\mathop {\lim }\limits_{k\to \infty } \left\| {x^k-\tilde {x}^k} \right\|=0$; (4) 序列$\left\{ {\tilde {x}^k} \right\}$有界.

下面给出算法2.1的收敛性证明过程中的一个引理.

引理2.4  给定$x^k\in R_{++}^n $, $\beta _k >0$, 且$\tilde{x}^k$$\xi ^k$满足式(2.1) 和式(2.2), 那么对任意的$x\in R_+^n$有下式成立:

$ \left\langle {\tilde {x}^k-x, \beta _k F\left( {\tilde {x}^k} \right)-\xi ^k} \right\rangle \ge \left\langle {x^k-\tilde {x}^k, \left( {1+\mu } \right)x-\mu x^k-\tilde {x}^k} \right\rangle. $ (2.18)

  在引理2.2中, 假定$q=\beta _k F\left( {\tilde {x}^k} \right)-\xi^k$, $y=x$, 立有式(2.18) 成立.证毕.

最后给出算法2.1的全局收敛性定理.

定理2.4  如果$\inf _{k=0}^\infty \beta _k =\beta>0$, 那么由算法2.1生成的序列$\left\{ {x^k} \right\}$收敛到$NCP\left( F\right)$的一个解$x^\infty $.

利用参考文献[5]类似的证明方法, 由定理2.3, 推论2.1及引理2.4, 此定理可被证明, 不再赘述.

2.3 数值实验

本节将算法2.1与文献[4]中的算法作比较.考虑如下$NCP\left( F\right)$:

$ x\ge 0, \quad F\left( x \right)\ge 0, \quad x^TF\left( x \right)=0, $

其中$F\left( x \right)=D\left( x \right)+Mx+q$.式中的$D\left(x\right)$$Mx+q$分别为$F\left( x\right)$的非线性项和线性项.特别地, 令矩阵$M=A^TA+B$, 其中$A$是元素在区间$\left({-5, +5} \right)$上随机生成的$n\times n$阶矩阵, $B$是元素在区间$\left({-5, +5} \right)$上随机生成的$n\times n$阶反对称矩阵.向量$q\in R^n$的元素在区间$\left({-200, +300} \right)$上服从一致分布. $D\left( x\right)$的分量为$D_j \left( x \right)=d_j \ast \arctan \left( {x_j } \right)$, 其中$d_j \in \left( {0, 1}\right)$.文献[4-6, 14]测试了类似的问题.

测试上述${\rm NCP}\left( F\right)$的维数$n$从100到1000.为了与文献[4]中的算法作对比, 选取相同的参数:$\beta_0 =1$, $\eta =0.91$, $\mu =0.01$, $\delta =\sigma =1.99$, $\tau=0.01$; 终止准则为

$ \left\| {\min \left\{ {x^k, F\left( {x^k} \right)} \right\}} \right\|_\infty \le 10^{-7}, $

初始点为$x^0=\left( {0, 0, \cdots, 0}\right)^T$$x^0=\left({1, 1, \cdots, 1} \right)^T$.

利用MATLAB2013a编程, 对比结果见下表 2.12.2所示, 其中No.iter.表示迭代次数; CPU(s)表示计算时间, 单位为秒.

表 2.1 初始点为$x^0=\left( {0, 0, \cdots, 0} \right)^T$时的数值实验结果

表 2.2 初始点为$x^0=\left( {1, 1, \cdots, 1} \right)^T$时的数值实验结果

从表中两种算法的迭代次数和计算时间来看, 虽然数值实验的部分结果较文献[13]中提出的算法表现差, 但整体上, 算法2.1比文献[4]中提出的算法更加有效.例如, 从表 2.1中可看出当维数为400时, 算法2.1的No.iter.和CPU($s$)分别为278次和2.16秒差于文献[4]中所提算法的269次的迭代次数和1.91秒的计算时间.但是, 从表 2.2中可以看出当维数为800时, 本文所改进的算法的No.iter.和CPU($s$)分别为309次和16.32秒, 显然要优于文献[4]中所提算法的311次的迭代次数和16.63秒的计算时间.

3 非线性互补问题的Levenberg-Marquardt算法
3.1 Levenberg-Marquardt算法的构造及适定性分析

算法3.1

步骤0  给定初始点$x^0\in R^n$.选取参数$\eta \in \left({0, 1} \right)$, $\lambda \in \left( {0, 1} \right)$, $\alpha \in \left( {0, 0.8} \right)$, $\sigma \in \left( {0, \frac{1}{2}\left({1-\alpha } \right)} \right)$, $\delta \in \left( {0, \left. 2 \right]} \right.$.令$\mu _0=\left\| {\Phi \left( {x^0} \right)} \right\|^\delta $, $\kappa :=\sqrt {2n} $, $\beta _0 :=\left\| {\Phi \left( {x^0} \right)} \right\|$, $\tau _0 :=\left({\frac{\alpha }{2\kappa }\beta _0 } \right)^2$, $k:=0$.

步骤1  如果$x^k$满足$\left\| {\nabla \Psi \left( {x^k} \right)} \right\|\le\varepsilon $, 则算法终止.

步骤2  计算如下LM线性方程组, 得到搜索方向$d_L^k \in R^n$,

$ \left( {\Phi _{\tau _k }' \left( {x^k} \right)^T\Phi _{\tau _k }' \left( {x^k} \right)+\mu _k I} \right)d_L^k =-\Phi _{\tau _k }' \left( {x^k} \right)^T\Phi _{\tau _k } \left( {x^k} \right), $ (3.1)

其中$\mu _k =\left\| {\Phi \left( {x^k} \right)} \right\|^\delta$.

步骤3  计算如下方程组, 得到修正的搜索方向$d_G^k \in R^n$,

$ \left( {\Phi _{\tau _k }' \left( {x^k} \right)^T\Phi _{\tau _k }' \left( {x^k} \right)+\mu _k I} \right)d_G^k =-\Phi _{\tau _k }' \left( {x^k} \right)^T\Phi _{\tau _k } \left( {x^k} \right)+\mu _k d_L^k. $ (3.2)

步骤4  令$\sigma _k =\min \left\{ {\sigma, \frac{1}{4}\mu_k } \right\}$,并计算使下式成立的最小的非负整数 $s_k \in \left\{{0, 1, \cdots} \right\}$:

$ \Psi _{\tau _k } \left( {x^k+\lambda ^{s_k }d_G^k } \right)\le \Psi _{\tau _k } \left( {x^k} \right)-\sigma _k \lambda ^{s_k }\left\| {d_G^k } \right\|^2. $ (3.3)

$t_k :=\lambda ^{s_k }$, $x^{k+1}=x^k+t_k d_G^k $.

步骤5  若

$ \left\| {\Phi \left( {x^{k+1}} \right)} \right\|\le \max \left\{ {\eta \beta _k, \frac{1}{\alpha }\left\| {\Phi \left( {x^{k+1}} \right)-\Phi _{\tau _k } \left( {x^{k+1}} \right)} \right\|} \right\}, $ (3.4)

则令$\beta _{k+1} :=\left\| {\Phi \left( {x^{k+1}} \right)} \right\|$, 并选取合适的$\tau _{k+1} $, 满足

$ 0<\tau _{k+1} \le \min\left\{ {\left( {\frac{\alpha }{2\kappa }\beta _{k+1}} \right)^2, \frac{1}{4}\tau _k, \bar {\tau }\left( {x^{k+1}, \gamma \beta _{k+1} } \right)} \right\}, $ (3.5)

其中$\bar {\tau }\left( . \right)$的定义见引理3.2.否则, 令$\beta _{k+1}:=\beta _k $, 并选取合适的$\tau _{k+1} $, 满足

$ 0<\tau _{k+1} \le \min\left\{ {\left( {\frac{\alpha }{2\kappa }\left\| {\Phi \left( {x^{k+1}} \right)} \right\|} \right)^2, \frac{1}{4}\tau _k } \right\}. $ (3.6)

步骤6  令$k:=k+1$, 转到步骤1.

注3.1  在算法3.1中, 对搜索方向$d_L^k$进行了修正, 使得新的搜索方向更接近Moore-Penrose步.另外, 为使算法更加精确有效, 针对文献[15]中的算法中所涉及到的一些参数及不等式关系做了适当的修正.

为了讨论简便, 定义指标集

$ K:=\left\{ 0 \right\}\cup \left\{ {k\in N\left| {\left\| {\Phi \left( {x^k}\right)} \right\|\le \max \left\{ {\eta \beta _{k-1} , \frac{1}{\alpha}\left\| {\Phi \left( {x^k} \right)-\Phi _{\tau _k -1} \left( {x^k} \right)}\right\|} \right\}} \right.} \right\}. $ (3.7)

下面先给出连续函数的广义Jacobian的定义.

定义3.1  假定$G:R^n\to R^n$是局部Lipschitz连续函数.那么, $G$在点$x\in R^n$处的Clarke[16]意义下的广义Jacobian定义为

$ \partial G\left( x \right)=\mbox{conv}\left\{ {H\in R^{n\times n}\left| {H=\mathop {\lim}\limits_{x^k\to x} G'\left( {x^k} \right), x^k\in D_G } \right.} \right\}, $

其中$D_G $是由$G$的可微点组成的集合, $\mbox{conv}\left( A \right)$是集合$A$的凸包.

Qi [17]$\partial _C G\left( x \right)^T$$G$在点$x\in R^n$处的$C$ -次微分, 其中$\partial_C G\left( x \right)^T:=\partial G_1 \left( x \right)\times \cdots\cdots \times \partial G_n \left( x \right).$

下面的引理是分析算法收敛性时必要的结论.

引理3.1

(1)[8]  对任意的$x\in R^n$, $\tau _1, \tau _2 \ge 0$, 成立$\left\| {\Phi _{\tau _1 } -\Phi _{\tau _2 } }\right\|\le \kappa \left|{\sqrt {\tau _1 } -\sqrt {\tau _2 } }\right|$, 其中$\kappa =\sqrt {2n} $.特别地, 对任意的$x\in R^n$, $\tau\ge 0$, 有$\left\| {\Phi _\tau -\Phi } \right\|\le \kappa\sqrt \tau $.

(2)[15]  对所有的$k\in N$, 成立

$ \left\| {\Phi _{\tau _k } \left( {x^{k+1}} \right)} \right\|<\left\| {\Phi_{\tau _k } \left( {x^k} \right)} \right\| $

$ \left\| {\Phi _{\tau _k } \left( {x^{k+1}} \right)} \right\|+\kappa \sqrt{\tau _{k+1} } \le \left\| {\Phi _{\tau _k } \left( {x^k} \right)}\right\|+\kappa \sqrt {\tau _k }. $

本章所给出算法中, 涉及到$\bar {\tau }$的表达式, 因此给出下面一个结论.

引理3.2[9]  假定$x\in R^n$是任意给定的向量, $x$不是${\rm NCP}\left( F \right)$的解.定义如下常量

$ \gamma \left( x \right):=\mathop {\max}\limits_{i\notin \beta \left( x \right)} \left\{ {\left\| {x_i e_i +F_i \left( x \right)\nabla F_i \left( x \right)} \right\|} \right\}\ge 0 $

$ \alpha \left( x \right):=\mathop {\max}\limits_{i\notin \beta \left( x \right)} \left\{ {x_i^2 +F_i \left( x \right)^2} \right\}>0, $

其中$\beta \left( x \right):=\left\{ {i\left| {x_i =F_i \left( x \right)=0} \right.}\right\}$.假定$\delta $为一给定的常数, 定义

$ \bar {\tau }\left( {x, \delta } \right):= \begin{cases} 1, \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; \dfrac{n\gamma \left( x \right)^2}{\delta ^2}-\alpha \left( x\right)\le 0, \\ \dfrac{\alpha \left( x \right)^2}{2}\dfrac{\delta ^2}{n\gamma \left( x\right)^2-\delta ^2\alpha \left( x \right)}, \dfrac{n\gamma \left( x \right)^2}{\delta ^2}-\alpha \left( x\right)> 0, \end{cases} $

则对任意的$\tau \in \left( {0, \left. {\bar {\tau }\left({x, \delta }\right)} \right]} \right.$, 有$\textrm{dist}\left({\Phi _\tau ' \left( x\right), \partial _C \Phi \left( x \right)}\right)\le \delta $.

下面证明算法3.1的适定性.

定理3.1  算法3.1是适定的.

  只要证明当$\lambda$足够小时, Armijo线性搜索准则满足即可.由式(3.1), (3.2), $0<\sigma_k \le \mu _k $$\nabla \Psi _\tau \left( x \right)=\left( {\Psi _\tau ' \left( x \right)} \right)^T=\left({\Phi _\tau ' \left( x \right)}\right)^T\Phi _\tau \left( x \right)$, 对足够小的$\lambda $

$ \;\;\;\;\;\;\; \Psi _{\tau _k } \left( {x^k+\lambda d_G^k } \right) -\Psi _{\tau _k } \left({x^k} \right) =\lambda \nabla \Psi _{\tau _k } \left( {x^k} \right)^Td_G^k +o\left(\lambda \right) \\\;\; \le -\lambda \left( {\Phi _{\tau _k } \left( {x^k} \right)^T\Phi _{\tau _k}' \left( {x^k} \right)-\mu _k \left( {d_L^k } \right)^T} \right)d_G^k+o\left( \lambda \right)\\\;\; =-\lambda \left( {d_G^k } \right)^T\left( {\nabla \Phi _{\tau _k } \left({x^k} \right)\Phi _{\tau _k }' \left( {x^k} \right)+\mu _k I} \right)d_G^k+o\left( \lambda \right)\\\;\; \le -\lambda \mu _k \left\| {d_G^k } \right\|^2+o\left( \lambda \right) \le -\lambda \sigma _k \left\| {d_G^k } \right\|^2+o\left( \lambda \right). $

故算法3.1是适定的.证毕.

3.2 收敛性分析

这一部分分析算法3.1的全局收敛性.先介绍一个引理.

引理3.3[8]  令$\left\{{x^k}\right\}$是由算法3.1生成的序列.假设该序列至少有一个聚点$x^\ast $, 则有

(1) 如果$x^\ast $${\rm NCP}\left( F\right)$的一个解, 则指标集$K$是无限集, 且$\left\{ {\tau _k } \right\}\to 0$.

(2) 指标集$K$是无限集, 当且仅当$\left\{{x^k}\right\}$的每个聚点都是${\rm NCP}\left( F\right)$的一个解.

下面利用奇异值分解定理简化$d_L^k $, $d_G^k $.由式(3.1), (3.2) 知

$ d_L^k =-\left[{\nabla \Phi _{\tau _k } \left( {x^k} \right)\Phi _{\tau _k}' \left( {x^k} \right)+\mu _k I} \right]^{-1}\nabla \Phi _{\tau _k }\left( {x^k} \right)\Phi _{\tau _k } \left( {x^k} \right), \\ d_G^k =-\left[{\nabla \Phi _{\tau _k } \left( {x^k} \right)\Phi _{\tau _k}' \left( {x^k} \right)+\mu _k I} \right]^{-1}\left[ {\nabla \Phi _{\tau _k} \left( {x^k} \right)\Phi _{\tau _k } \left( {x^k} \right)-\mu _k d_L^k }\right]. $

由奇异值分解定理, $\Phi _{\tau _k }' \left( {x^k}\right)$被分解为

$ \Phi _{\tau _k }' \left( {x^k} \right)=U_k \Sigma _k V_k^T =U_k \left[{\begin{array}{cccc} \sigma _{k, 1}&&& \\ &\sigma _{k, 2}&& \\ & &\ddots &\\ & & &\sigma _{k, n} \\ \end{array}} \right]V_k^T, $

其中$U_k $, $V_k \in R^{n\times n}$是正交矩阵, 且$\sigma_{k, 1} \ge \sigma_{k, 2} \ge \cdots\ge \sigma _{k, n} \ge 0$.那么有

$ d_L^k =-V_k \Lambda _k V_k^T \nabla \Psi _{\tau _k } \left( {x^k} \right), \notag \\ d_G^k =-V_k \Lambda _k V_k^T \left[{\nabla \Psi _{\tau _k } \left( {x^k}\right)-\mu _k d_L^k } \right] \notag \\\;\;\;\; =-V_k \Lambda _k V_k^T \left[{I+\mu _k V_k \Lambda _k V_k^T } \right]\nabla\Psi _{\tau _k } \left( {x^k} \right) \notag \\\;\;\;\; =-V_k \left( {\Lambda _k \Theta _k } \right)V_k^T \nabla \Psi _{\tau _k }\left( {x^k} \right) \notag \\\;\;\;\; =-V_k X_k V_k^T \nabla \Psi _{\tau _k } \left( {x^k} \right), $ (3.8)

其中

$ \begin{eqnarray*} && \Lambda _k =\left[{\begin{array}{l} \frac{1}{\sigma _{k, 1}^2 +\mu _k } \\ \hspace{1cm}\ddots \\ \hspace{1.5cm}\frac{1}{\sigma _{k, n}^2 +\mu _k } \\ \end{array}} \right], \Theta _k =\left[{\begin{array}{l} 1\mbox{+}\frac{\mu _k }{\sigma _{k, 1}^2 +\mu _k } \\ \hspace{1.5cm} \ddots \\ \hspace{2cm}\frac{\mu _k }{\sigma _{k, n}^2 +\mu _k } \\ \end{array}} \right], \\ && X_k =\left[{\begin{array}{l} \frac{1}{\sigma _{k, 1}^2 +\mu _k }\left( {1\mbox{+}\frac{\mu _k }{\sigma _{k, 1}^2 +\mu _k }} \right) \\ \hspace{3cm}\ddots \\ \hspace{3.5cm}\frac{1}{\sigma _{k, n}^2 +\mu _k }\left( {1\mbox{+}\frac{\mu _k }{\sigma _{k, n}^2 +\mu _k }} \right) \\ \end{array}} \right]. \end{eqnarray*} $

那么

$ \left\| {d_G^k } \right\|^2=\nabla \Psi _{\tau _k } \left( {x^k} \right)^TV_k X_k^2 V_k^T \nabla \Psi _{\tau _k } \left( {x^k} \right). $ (3.9)

最后, 采用文献[14]中收敛性分析相似的思想, 证明算法3.1的全局收敛性.

定理3.2  假设$\left\{{x^k}\right\}$是由算法3.1生成的序列, $\left\{ {x^k} \right\}_L$是其收敛于$x^\ast$的一个子列, 则$x^\ast $是函数$\Psi$的稳定点.

  假设指标集$K$是无限集, 由引理3.3知, 结论成立.

假设指标集$K$是有限集.令$K\cap L=\emptyset$, $\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {k}} $$K$中最大的数.那么, 由算法3.1中步骤5知对所有的$k>\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {k}}$, 有

$ \tau _k \le \tau _{\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {k}} }, \beta _k =\beta_{\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {k}} } =\left\| {\Phi \left({x^{\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {k}} }} \right)} \right\|, \left\| {\Phi \left( {x^k} \right)} \right\|>\eta\beta _{k-1} =\eta \left\| {\Phi \left( {x^{\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {k}}}} \right)} \right\|>0. $

那么对所有的$k>\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {k}}$,

$ \Psi \left( {x^k} \right)=\frac{1}{2}\left\| {\Phi \left( {x^k} \right)} \right\|^2>\frac{1}{2}\eta ^2\left\| {\Phi \left( {x^{\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {k}} }} \right)} \right\|^2=\eta ^2\Psi \left( {x^{\mathord{\buildrel{\lower3pt\hbox{$\scriptscriptstyle\frown$}}\over {k}} }} \right)>0 . $

下面证明$\nabla \Psi \left( {x^\ast } \right)=0$.如若不然, 有$\nabla \Psi\left( {x^\ast } \right)\ne 0$.

首先证明$\mathop {\lim }\limits_{k\in L} \inf t_k =0$.假设$\mathop {\lim}\limits_{k\in L} \inf t_k =t^\ast>0$.由Armijo线搜索准则(3.3), 对所有的$k\in L$,

$ \Psi _{\tau _k } \left( {x^{k+1}} \right)-\Psi _{\tau _k } \left( {x^k}\right)\le -\sigma _k t_k \left\| {d_G^k } \right\|^2 . $ (3.10)

由于$\left\{ x \right\}_L \to x^\ast $, $\left\{ {\tau _k }\right\}\to 0$, 则存在某个$m>0$使得对所有的$k\in L$, $i=1, \cdots, n$, 有$\left| {\sigma _{k, i}^2 } \right|<m$.由假设$\nabla \Psi \left( {x^\ast } \right)\ne 0$, $\left\{ {\mu_k } \right\}_L \to \mu ^\ast =\left\| {\Phi \left( {x^\ast }\right)} \right\|^\delta >0$, 故存在向量$M>0$, 对充分大的$k\in L$, 有

$ \left\| {d_G^k } \right\|^2\ge \left[{\frac{1}{m+\mu _k }\left( {1+\frac{\mu _k }{m+\mu _k }} \right)} \right]^2\left\| {\nabla \Phi _{\tau _k } \left( {x^k} \right)} \right\|^2\ge M\left\| {\nabla \Phi _{\tau _k } \left( {x^k} \right)} \right\|^2. $ (3.11)

由式(3.10), (3.11), 对充分大的$k\in L$, 有

$ \Psi _{\tau _k } \left( {x^{k+1}} \right)-\Psi _{\tau _k } \left( {x^k} \right)\le -\sigma _k t_k M\left\| {\nabla \Phi _{\tau _k } \left( {x^k} \right)} \right\|^2 . $ (3.12)

由于$K$是有限集, $\left\{ {\sigma _k } \right\}_L \to \sigma^\ast =\min \left\{ {\sigma, \frac{1}{4}\left\| {\Phi \left({x^\ast } \right)} \right\|^\delta } \right\}>0$$\left\{ {\tau_k } \right\}\to 0$, 假设$c:=\sigma ^\ast t^\ast M\left\|{\nabla \Phi _{\tau _k } \left( {x^k} \right)} \right\|^2$, 那么由式(3.12), 对充分大的$k\in L$, 有

$ \Psi _{\tau _k } \left( {x^{k+1}} \right)-\Psi _{\tau _k } \left( {x^k} \right)\le -\frac{c}{2} . $ (3.13)

假定$L=\left\{ {l_0, l_1, l_2 \cdots.} \right\}$.由引理3.1有$\left\| {\Phi \left( {x^{l_{j+1} }} \right)} \right\|\le \left\|{\Phi \left( {x^{l_j +1}} \right)} \right\|+2\kappa \sqrt {\tau_{l_j +1} }.$那么对充分大的$l_j $, 有

$ \Psi \left( {x^{l_{j+1} }} \right) =\frac{1}{2}\left\| {\Phi \left({x^{l_{j+1} }} \right)} \right\|^2 \le \left( {\left\| {\Phi \left( {x^{l_j +1}} \right)} \right\|+2\kappa \sqrt {\tau _{l_j +1} } } \right)^2 \notag \\\;\;\;\;\;\;\;\;\;\;\;\;\;\; =\Psi \left( {x^{l_j +1}} \right)+2\kappa \sqrt {\tau _{l_j +1} } \left\| {\Phi \left( {x^{l_j +1}} \right)} \right\|+2\kappa ^2\tau _{l_j +1} =\Psi \left( {x^{l_j +1}} \right)+\frac{c}{4}. $ (3.14)

由式(3.13)、式(3.14), 对充分大的$l_j $, 有

$ \Psi \left( {x^{l_{j+1} }} \right)-\Psi \left( {x^{l_j }} \right)=\left( {\Psi \left( {x^{l_{j+1} }} \right)-\Psi \left( {x^{l_j +1}} \right)} \right)+\left( {\Psi \left( {x^{l_j +1}} \right)-\Psi \left( {x^{l_j }} \right)} \right)\le -\frac{c}{4}, $

这与$\Psi \left( x \right)\ge 0$矛盾, 故$\mathop {\lim}\limits_{k\in L} \inf t_k =0$.因此可以令$\mathop {\lim}\limits_{k\in L} t_k=0$.那么对于足够大的$k\in L$, $\lambda^{s_k -1}$总不满足式(3.3), 即对充分大的$k\in L$, 有

$ \frac{\Psi _{\tau _k } \left( {x^k+\lambda ^{s_k -1}d_G^k } \right)-\Psi _{\tau _k } \left( {x^k} \right)}{\lambda ^{s_k -1}}>-\sigma \left\| {d_G^k } \right\|^2 . $ (3.15)

假定$\left\{ {d^k} \right\}_L \to d^\ast $, $\left\{ {\sigma _k }\right\}_L \to \sigma ^\ast $, 由式(3.15) 有

$ \nabla \Psi \left( {x^\ast } \right)^Td^\ast \ge -\sigma ^\ast \left\| {d^\ast } \right\|^2. $ (3.16)

假定$\left\{ {V_k } \right\}_L \to V_\ast $, $\left\{ {X_k }\right\}_L \to X_\ast $, $\mu ^\ast =\left\| {\Phi \left( {x^\ast} \right)} \right\|^\delta >0$, 其中

$ X_\ast =\textrm{diag}\left( {\frac{1}{\sigma _1^2 +\mu ^\ast }\left( {1+\frac{\mu ^\ast }{\sigma _1^2 +\mu ^\ast }} \right), \cdots, \frac{1}{\sigma _n^2 +\mu ^\ast }\left( {1+\frac{\mu ^\ast }{\sigma _n^2 +\mu ^\ast }} \right)} \right). $

由式(3.8), (3.9) 及式(3.16), 则有

$ -\nabla \Psi \left( {x^\ast } \right)^TV_\ast X_\ast V_\ast ^T \nabla \Psi \left( {x^\ast } \right)\ge -\sigma ^\ast \nabla \Psi \left( {x^\ast } \right)^TV_\ast X_\ast ^2 V_\ast ^T \nabla \Psi \left( {x^\ast } \right). $ (3.17)

$\sigma ^\ast \le \frac{1}{4}\mu ^\ast $, 则由式(3.17) 有$\nabla \Psi\left( {x^\ast } \right)=0$.证毕.

3.3 数值实验

将算法3.1与文献[15]中的算法做比较.给出两个算例, 说明算法3.1的优越性及有效性.

参数设定如下: $\varepsilon =1.0e-7$, $\eta =0.8$, $\lambda =0.5$, $\alpha =0.7$, $\sigma =0.05$, $\delta =2$, $\kappa :=\sqrt {2n}$.

例1  假定$F\left( x \right)=Mx+q$, 其中

$ M=\left[{{\begin{array}{*{20}c} 4 \hfill&1 \hfill&0 \hfill&{\cdots} \hfill&0 \hfill \\ {-2} \hfill&4 \hfill&1 \hfill&{\cdots} \hfill&0 \hfill \\ 0 \hfill&{-2} \hfill&4 \hfill&{\cdots} \hfill&0 \hfill \\ {\cdots} \hfill&{\cdots} \hfill&{\cdots} \hfill&{\cdots} \hfill&{\cdots} \hfill \\ 0 \hfill&0 \hfill&0 \hfill&{\cdots} \hfill&1 \hfill \\ 0 \hfill&0 \hfill&0 \hfill&{\cdots} \hfill&4 \hfill \\ \end{array} }} \right], \quad q=\left( {-1, -1, \cdots, -1} \right)^T. $

例2  (Kojima-Shindo Problem)假定$F\left( x\right)=\left( {f_1 \left( x \right), f_2 \left( x \right), f_3\left( x \right), f_4 \left( x \right)} \right)^T$, 其中

$ f_1 \left( x \right)=3x_1^2 +2x_1 x_2 +2x_2^2 +x_3 +3x_4 -6, \\ f_2 \left( x \right)=2x_1^2 +x_1 +x_2^2 +10x_3 +2x_4 -2, \\ f_3 \left( x \right)=3x_1^2 +x_1 x_2 +2x_2^2 +2x_3 +9x_4 -9, \\ f_4 \left( x \right)=x_1^2 +3x_2^2 +2x_3 +3x_4 -3. \\ $

该问题有两个解: $x^1=\left( {\frac{\sqrt 6 }{2}, 0, 0, 0.5}\right)^T$$x^2=\left( {1, 0, 3, 0} \right)^T$.

利用MATLAB2013a编程, 实验结果见下表 3.1所示, 其中No.iter.表示迭代次数. NF.表示计算函数$F$值的次数.

表 3.1 数值实验结果

从表中两种算法的迭代次数和函数$F$值的计算次数来看, 算法3.1比文献[15]中提出的算法更加有效.例如, 表 3.1中, 问题1和问题2经过两种算法的测试后, 算法3.1的迭代次数及函数$F$值的计算次数明显小于文献[15]中的算法.

参考文献
[1] Martinet B. Determination approachée d'un point flxe d'une application pseudo-contractante[J]. Comptes Rendus de l'Académie des Sci. Ser. Ⅰ Math., 1972, 274(2): 163–165.
[2] He B S, Liao L Z, Yuan X M. A LQP based interior prediction-correction method for nonlinear complementarity problems[J]. J. Comput. Math. Intern. Edi., 2006, 24(1): 33–44.
[3] Bnouhachem A. An LQP Method for pseudomonotone variational inequalities[J]. J. Glob. Optim., 2006, 36(3): 351–363. DOI:10.1007/s10898-006-9013-4
[4] Noor M A, Bnouhachem A. Modifled proximal point method for nonlinear complementarity problems[J]. J. Comput. Appl. Math., 2006, 197(2): 395–405. DOI:10.1016/j.cam.2005.08.028
[5] Xu Y, He B S, Yuan X. A hybrid inexact logarithmic-quadratic proximal method for nonlinear complementarity problems[J]. J. Math. Anal. Appl., 2006, 322(1): 276–287. DOI:10.1016/j.jmaa.2005.08.011
[6] Yuan X M. The prediction-correction approach to nonlinear complementary problems[J]. Euro. J. Oper. Res., 2007, 176(3): 1357–1370. DOI:10.1016/j.ejor.2005.11.006
[7] 陈金雄, 刘宁. 解P0非线性互补问题的光滑Levenberg-Marquardt方法[J]. 数学杂志, 2015, 35(4): 905–916.
[8] Kanzow C, Pieper H. Jacobian smoothing methods for nonlinear complementarity problems[J]. SIAM J. Optim., 1999, 9(2): 342–373. DOI:10.1137/S1052623497328781
[9] Yang Y F, QiL. Smoothing trust region methods for nonlinear complementarity problems with P0-functions[J]. Ann. Oper. Res., 2005, 133(1-4): 99–117. DOI:10.1007/s10479-004-5026-x
[10] Kanzow C. Some noninterior continuation methods for linear complementarity problems[J]. SIAM J. Matrix Anal. Appl., 1996, 17(4): 851–868. DOI:10.1137/S0895479894273134
[11] Qi H, Liao L. A smoothing Newton method for general nonlinear complementarity problems[J]. Comput. Optim. Appl., 2000, 17(2-3): 231–253.
[12] Yamashita N, Fukushima M. On the rate of convergence of the Levenberg-Marquardt method[J]. Comput., 2001, 15: 239–249.
[13] Dan H, Yamashita N, Fukushima M. Convergence properties of the inexact Levenberg-Marquardt method under local error bound conditions[J]. Optim. Meth. Soft., 2002, 17(4): 605–626. DOI:10.1080/1055678021000049345
[14] Bnouhachem A, Noor M A. A new logarithmic-quadratic proximal method for nonlinear complementarity problems[J]. Appl. Math. Comput., 2009, 215(2): 695–706.
[15] Yu H, Pu D. Smooting Levenberg-Marquardt method for general nonlinear complementarity problems under local error bound[J]. Appl. Math. Model., 2011, 35(3): 1337–1348. DOI:10.1016/j.apm.2010.09.012
[16] Clark F H. Optimization and nonsmooth analysis[M]. New York: John Wiley, 1983.
[17] Qi L. C-difierentiability, C-difierential operators and generalized Newton methods[J]. Sydney: Appl. Math. Report AMR96/5, Univ. New South Wales, 1996.