数学杂志  2016, Vol. 36 Issue (2): 365-374   PDF    
扩展功能
加入收藏夹
复制引文信息
加入引用管理器
Email Alert
RSS
本文作者相关文章
高雷阜
潘京乐
魏帅
三个可分离算子凸优化的线性化方法
高雷阜, 潘京乐, 魏帅     
辽宁工程技术大学理学院, 辽宁 阜新 123000
摘要:本文研究了三个可分离算子不含交叉变量的线性约束凸优化问题.利用定制的邻近点算法,对其变分不等式子问题进行线性化处理,并增加一邻近点项,使其子问题成为易于运算的单调线性变分不等式,得到了线性化定制的邻近点算法,并证明了全局收敛性,推广了文献中的研究结果.
关键词可分离算子线性约束问题    交替方向法    变分不等式    全局收敛性    
A LINEARIZED METHOD FOR THE SEPARABLE CONVEX PROGRAMMING WITH OBJECTIVE FUNCTION REPRESENTED AS THREE FUNCTIONS
GAO Lei-fu, PAN Jing-le, WEI Shuai     
College of Science, Liaoning Technical University, Fuxin 123000, China
Abstract: In this paper, we study the convex minimization problem with linear constraints and a block-separable objective function which is represented as the sum of three functions without coupled variables. Based on the customized proximal point algorithm, linearing its variable inequality subproblem and adding an approximate item to the subproblem, the new linearized method is offered and its global convergence is proved finally. The results have extended the corresponding studies documented.
Key words: linear constraints problems with separable operators     alternating direction method of multipliers     variational inequality     global convergence    
1 引言

对于目标函数不含交叉变量的三个可分离算子线性约束的凸优化最小值问题, 数学表达式

${\rm{min}}\left\{ {\sum\limits_{i = 1}^3 {{\theta _i}({x_i})} \left| {\sum\limits_{i = n}^3 {{A_i}{x_i} = b,{x_i} \in {X_i},i = 1,2,3} } \right.} \right\}$ (1.1)

其中 ${A_i} \in {R^{m \times ni}}(i = 1,2,3),b \in {R^m},{X_i}\underline \subset {R^{mi}}(i = 1,2,3)$ 为非空闭凸集, ${n_1} + {n_2} + {n_3} = n$${\theta _i}:{X_i} \to {R^{ni}}(i = 1,2,3)$是凸函数(不必光滑) .

因映射$\theta _i \left( {i=1,2,3} \right)$只与变量$x_i $有关, 称这类变分不等式为可分离的变分不等式.虽然模型 (1.1)结构特殊, 但它在很多领域都有很重要的作用, 比如在文献[1]提到的fused Lasso 问题, 文献[2]涉及的低秩稀疏矩阵的复原问题.

为便于讨论, 假设

(i) 模型(1.1)的解集$\Omega ^\ast $非空;

(ii) 矩阵$A_i \left( {i=1,2,3} \right)$均为列满秩的.

对于问题 (1.1), 文献[4- 5]提到的增广拉格朗日乘子法 (augemented Lagrangian method, ALM) 是可行的, 其拉格朗日函数可表示为

$L_\beta \left( {x_1 ,x_2 ,x_3 ,\lambda } \right):=\sum\limits_{i=1}^3 {\theta _i \left( {x_i } \right)} -\lambda ^T\left( {\sum\limits_{i=1}^3 {A_i x_i } -b} \right)+\frac{\beta }{2}\left\| {\sum\limits_{i=1}^3 {A_i x_i } -b} \right\|^2,$ (1.2)

其中$\lambda \in R^m$是增广的拉格朗日乘子, $\beta >0$是事先设定的罚参数.相应的模型 (1.1)拉格朗日乘子收缩算法框架.

对给定的$\lambda ^k$, 先求

$\left( {x_1^{k+1} ,x_2^{k+1} ,x_3^{k+1} } \right)\leftarrow {\hbox{Arg}}\min \left\{ {L_\beta \left( {x_1 ,x_2 ,x_3 ,\lambda ^k} \right)\left| {x_i \in X_i ,i=1,2,3} \right.} \right\},$ (1.3)

然后令

$\lambda ^{k+1}=\lambda ^k-\beta \left( {\sum\limits_{i=1}^3 {A_i x_i } -b} \right),$ (1.4)

ALM算法每次迭代过程要求所有的自变量同时最小化, 将其推广用于解决凸优化最小值问题(1.1)的交替方向法 (alternating direction method of multipliers, ADM) 的算迭代框架为

$\left\{ {\begin{array}{l} x_1^{k+1} \leftarrow {\hbox{Arg}}\min \left\{ {L_\beta \left( {x_1 ,x_2^k ,x_3^k ,\lambda ^k} \right)\left| {x_1 \in X_1 } \right.} \right\}, \\ x_2^{k+1} \leftarrow {\hbox{Arg}}\min \left\{ {L_\beta \left( {x_1^{k+1} ,x_2 ,x_3^k ,\lambda ^k} \right)\left| {x_2 \in X_2 } \right.} \right\}, \\ x_3^{k+1} \leftarrow {\hbox{Arg}}\min \left\{ {L_\beta \left( {x_1^{k+1} ,x_2^{k+1} ,x_3 ,\lambda ^k} \right)\left| {x_3 \in X_3 } \right.} \right\}, \\ \lambda ^{k+1}=\lambda ^k-\beta \left( {\sum\limits_{i=1}^3 {A_i x_i^{k+1} } -b} \right). \\ \end{array}}\right.$ (1.5)

自从它在文献[6- 7]中被提出来以后, ADM算法受到了很大关注, 见文献[8- 12], 并发现它在科学计算的很多领域都有重要的作用, 见文献[13- 16]. 但是, 注意到尽管从经验上其数值实验表现很不错[2, 20], 但对于三个可分离算子的ADM算法, 其收敛性依然未从理论上得到证明, 这激励一些建立在ADM基础上的算法的诞生, 见文献[17- 19]. 类似于模型(1.1)的分离结构, 文献[17- 19,21]讨论的算法模型甚至是多个(大于三个)可分离算子凸优化的收缩算法.

值得注意的是文献[3, 24-26]中的大量数值结果表明, 当用交替方向法求解各类问题时, 罚参数$\beta $对算法的数值表现影响较大: 较小的$\beta $值虽然能保证算法的收敛性, 但是会导致收敛速度变慢; 而一个较大的$\beta $值可能会导致算法不收敛. 因此, 另一改进策略是克服固定罚参数$\beta $带来的弊端, 即允许$\beta $在迭代过程中单调变化, 或自适应的调节.

文献[21]提出的定制的邻近点算法中对多个(大于三个)可分离算子凸优化问题的分析及算法的收敛性证明, 使得多个可分离算子线性约束凸优化问题的研究更深一层.其中作为特例, 令$\alpha \in \left( {0,1} \right)$$\beta >0$, 借助于初始迭代向量$w^0=\left( {x_1^0 ,x_2^0 ,x_3^0 ,\lambda ^0} \right)$, 三个可分离算子定制的邻近点交替方向算法的迭代框架为

(1) 预测阶段, 由给定的$w^0$, 先求

$\tilde x_i^k = {\rm{Arg}}\min \left\{ {{\theta _i}\left( {{x_i}} \right) + \frac{\beta }{{2 \times 3}}{{\left\| {{A_i}\left( {3{x_i} - 2x_i^k} \right) + \sum\limits_{j = 1,j \ne i}^3 {{A_j}x_j^k} - b - \frac{1}{\beta }{\lambda ^k}} \right\|}^2}\left| {{x_i} \in {X_i}} \right.} \right\}, \\ i = 1,2,3,$ (1.6)
$\tilde {\lambda }^k=\lambda ^k-\beta \left( {\sum\limits_{j=1}^3 {A_j \tilde {x}_j^k } -b} \right).$ (1.7)

(2) 松弛迭代校正阶段, 由上步得到的$\left( {\tilde {x}_1^k, \tilde {x}_2^k ,\tilde {x}_3^k ,\tilde {\lambda }^k} \right)$, 求

$\left( {\begin{array}{l} x_1^{k+1} \\ x_2^{k+1} \\ x_3^{k+1} \\ \lambda ^{k+1} \\ \end{array}} \right)=\left( {\begin{array}{l} x_1^k \\ x_2^k \\ x_3^k \\ \lambda ^k \\ \end{array}} \right)-\alpha \left( {\begin{array}{l} x_1^k -\tilde {x}_1^k \\ x_2^k -\tilde {x}_2^k \\ x_3^k -\tilde {x}_3^k \\ \lambda ^k-\tilde {\lambda }^k \\ \end{array}} \right).$ (1.8)

然而, 在实际问题中, 类似于结构型(二个可分离算子)优化问题求解过程所遇到的, 精确求解子问题(1.6)往往花费很大.因此, 本文通过子问题的线性化及增加一邻近点项, 将其转化为强单调的线性变分不等式问题, 相对而言, 求解容易很多.

内容安排如下: 第2节介绍了分析用到的预备知识; 在第3节给出改进的新算法的迭代框架; 第4节证明了这种新算法的全局收敛性; 在第5节, 作了一些简要总结.

2 预备知识
2.1 基本标记及概念

$\left\| \cdot \right\|_p $表示标准的$p$范数, 令$\left\| \cdot \right\|:=\left\| \cdot \right\|_2 $表示欧几里得范数.对于一个正定对称矩阵$G$, 定义$\left\| \cdot \right\|_G $$G$ -范数, 比如$\left\| x \right\|_G :=\sqrt {x^TGx} $.

定义 2.1$f:R^n\to \left( {-\infty ,+\infty } \right)$. 若$\forall x\in R^n,y\in R^n$, 有

$ f\left( {tx+\left( {1-t} \right)y} \right)\le tf\left( x \right)+\left( {1-t} \right)f\left( y \right), $

$\forall t\in \left[ {0,1} \right]$都成立, 则称$f$是凸函数.

定义 2.2 对于$\mbox{VI}\left( {\Omega \mbox{,}F} \right)$中从$R^n$$R^n$的映射$F$, 即$F$满足

$\left( {F\left( u \right)-F\left( v \right)} \right)^T\left( {u-v} \right)\ge 0,\forall u,v\in R^n,$

则称$F$$R^n$上是单调的, 且变分不等式是单调的; 若存在正常数$\eta $使得

$\left( {u-v} \right)^T\left[ {F\left( u \right)-F\left( v \right)} \right]\ge \eta \left\| {u-v} \right\|^2, \forall u,v\in \Omega $

成立, 则称$F$$\Omega $上是强单调的.

定义 2.3 由算法产生的预测-校正迭代序列$\left\{ {w^k} \right\}$满足收缩性质

$\left\| {w^{k+1}-w^\ast } \right\|^2\le \left\| {w^k-w^\ast } \right\|^2-\alpha \left( {2-\alpha } \right)\left\| {w^k-\tilde {w}^k} \right\|^2,$

称这样的序列$\left\{ {w^k} \right\}$$F$单调的, 其中$\alpha $为校正过程中的松弛因子.

2.2 变分不等式的等价表述

求解模型 (1.1)的问题等价于寻找点集$\Omega ^\ast =\left( {x_1^\ast ,x_2^\ast ,x_3^\ast ,\lambda ^\ast } \right)\in \Omega $, 满足

$\left\{ {\begin{array}{l} \theta _1 \left( {x_1 } \right)-\theta _1 \left( {x_1^\ast } \right)+\left( {x_1 -x_1^\ast } \right)^T\left( {-A_1^T \lambda ^\ast } \right)\ge 0, \\ \theta _2 \left( {x_2 } \right)-\theta _2 \left( {x_2^\ast } \right)+\left( {x_2 -x_2^\ast } \right)^T\left( {-A_2^T \lambda ^\ast } \right)\ge 0, \\ \theta _3 \left( {x_3 } \right)-\theta _3 \left( {x_3^\ast } \right)+\left( {x_3 -x_3^\ast } \right)^T\left( {-A_3^T \lambda ^\ast } \right)\ge 0, \\ \left( {\lambda -\lambda ^\ast } \right)^T\left( {\sum\limits_{i=1}^3 {A_i x_i^\ast } -b} \right)\ge 0, \\ \end{array}} \right. \forall \left( {x_1 ,x_2 ,x_3 ,\lambda } \right)\in \Omega ,$ (2.1)

则一阶优化条件(2.1)可写成紧凑形式

$w^\ast \in \Omega ,\theta \left( u \right)-\theta \left( {u^\ast } \right)+\left( {w-w^\ast } \right)^TF\left( {w^\ast } \right)\ge 0,\forall w\in \Omega.$ (2.2)

注意到$F$是单调的, 且$\Omega ^\ast $表示变分不等式(2.2)的解集, 其中 $ \Omega =X_1 \times X_2 \times X_3 \times R^m. $

$ u=\left( {x_1 ,x_2 ,x_3 } \right)^T, F\left( w \right)=\left( {-A_1^T \lambda ,-A_2^T \lambda ,-A_3^T \lambda ,\sum\limits_{i=1}^3 {A_i x_i -b} } \right)^T, \\ w=\left( {x_1 ,x_2 ,x_3 ,\lambda } \right)^T, \theta \left( u \right)=\theta _1 \left( {x_1 } \right)+\theta _2 \left( {x_2 } \right)+\theta _3 \left( {x_3 } \right).$

通篇假设:

1. 问题(1.2)的解集$\Omega ^\ast $是非空的;

2. 假设对给定的凸函数 ${\theta _i}({x_i}){(_i} = 1,2,3)$$\forall \gamma > 0,a \in {R^n}$, 子问题

$\min \left\{ {{\theta _i}({x_i}) + \frac{\gamma }{2}{{\left\| {{A_i}{x_i} - a} \right\|}^2}\left| {{x_i} \in {X_i}} \right.} \right\},i = 1,2,3$
3 算法描述
3.1 新方法迭代框架

这一节主要给出了基于三个可分离算子定制邻近点算法 (proximal point algorithm, PPA) 的线性化算法:

线性化(1.6)在$x^k$处的二次方项

$\left\{ {\frac{\beta }{2\times 3}\left\| {A_i \left( {3x_i -2x_i^k } \right)+\sum\limits_{j=1,j\ne i}^3 {A_j x_j^k } -b-\frac{1}{\beta }\lambda ^k} \right\|^2} \right\},$

并在目标函数增加一邻近点项$\frac{r}{2}\left\| {x_i -x_i^k } \right\|^2\left( {i=1,2,3} \right)$,其中$r$为参数.线性化后, 由给定的$\left( {x_1^0 ,x_2^0 ,x_3^0 ,\lambda ^0} \right)$得到

(1) 预测序列

$\tilde {x}_i^k ={\hbox{Arg}}\min \left\{ {\theta _i \left( {x_i } \right)+\beta x_i^T A_i^T \left( {\sum\limits_{j=1}^3 {A_i x_j^k } -b-\frac{1}{\beta }\lambda ^k} \right)+\frac{r}{2}\left\| {x_i -x_i^k } \right\|^2\left| {x_i \in X_i } \right.} \right\},\nonumber\\ i=1,2,3,$ (3.1)
$\tilde {\lambda }^k=\lambda ^k-\beta \left( {\sum\limits_{j=1}^3 {A_j \tilde {x}_j^k } -b} \right).$ (3.2)

(2) 校正序列

$\left( {\begin{array}{l} x_1^{k+1} \\ x_2^{k+1} \\ x_3^{k+1} \\ \lambda ^{k+1} \\ \end{array}} \right)=\left( {\begin{array}{l} x_1^k \\ x_2^k \\ x_3^k \\ \lambda ^k \\ \end{array}} \right)-\alpha \left( {\begin{array}{l} x_1^k -\tilde {x}_1^k \\ x_2^k -\tilde {x}_2^k \\ x_3^k -\tilde {x}_3^k \\ \lambda ^k-\tilde {\lambda }^k \\ \end{array}} \right).$ (3.3)

(3) 参数要求 $\alpha \in \left( {0,2} \right)$, 对给定的$\beta >0$, 参数$r$满足 $rI-\beta A^TA\ge 0$, 其中$A=\left( {A_1 ,A_2 ,A_3 } \right)$.

注记 此处校正步长$\alpha $是固定不变的, $\alpha $的选取对算法收敛及收敛速度都有重要影响, 故允许$\alpha $在迭代过程中自适应地调节是可取的.

3.2 变分不等式的等价表述

公式(3.1), (3.2)等价于求解变分不等式

$\theta \left( u \right)-\theta \left( {\tilde {u}^k} \right)+\left( {w-\tilde {w}^k} \right)^T\left\{ {F\left( {\tilde {w}^k} \right)+Q\left( {\tilde {w}^k-w^k} \right)} \right\}\ge 0,$ (3.4)

$u=\left( {x_1 ,x_2 ,x_3 } \right)^T$, $w=\left( {x_1 ,x_2 ,x_3 ,\lambda } \right)^T,$ $F\left( w \right)=\left( {-A_1^T \tilde {\lambda }^k,-A_2^T \tilde {\lambda }^k,-A_3^T \tilde {\lambda }^k,\sum\limits_{j=1}^3 {A_j \tilde {x}_j^k -b} } \right)^T$,

$ Q=\left( {{\begin{array}{*{20}c} {rI_{n_1 } -\beta A_1^T A_1 } \hfill & {-\beta A_1^T A_2 } \hfill & {-\beta A_1^T A_3 } \hfill & 0 \hfill \\ {-\beta A_2^T A_1 } \hfill & {rI_{n_2 } -\beta A_2^T A_2 } \hfill & {-\beta A_2^T A_3 } \hfill & 0 \hfill \\ {-\beta A_3^T A_1 } \hfill & {-\beta A_3^T A_2 } \hfill & {rI_{n_3 } -\beta A_3^T A_3 } \hfill & 0 \hfill \\ 0 \hfill & 0 \hfill & 0 \hfill & {\frac{1}{\beta }I_m } \hfill \\ \end{array} }} \right) $

为半正定矩阵, 且显然$F$为单调的.

分析: 公式(3.1)等价于求解变分不等式

$\theta _i \left( {x_i } \right)-\theta _i \left( {\tilde {x}_i^k } \right)+\left( {x_i -\tilde {x}_i^k } \right)\left\{ {\beta A_i^T \left( {\sum\limits_{j=1}^3 {A_j x_j^k } -b-\frac{1}{\beta }\lambda ^k} \right)+r\left( {x_i -x_i^k } \right)} \right\}\ge 0, i=1,2,3,$

借用公式(3.2), 上式可化为

$\theta _i \left( {x_i } \right)-\theta _i \left( {\tilde {x}_i^k } \right)+\left( {x_i -\tilde {x}_i^k } \right)\left\{ {\beta A_i^T \left( {-\sum\limits_{j=1}^3 {A_j \left( {\tilde {x}_j^k -x_j^k } \right)} -\frac{1}{\beta }\tilde {\lambda }^k} \right)+r\left( {x_i -x_i^k } \right)} \right\}\ge 0, i=1,2,3,$

$\forall x_i \in X_i $, 有$\tilde {x}_i^k \in X_i $, 则上式化为

$\theta _i \left( {x_i } \right)-\theta _i \left( {\tilde {x}_i^k } \right)+\left( {x_i -\tilde {x}_i^k } \right)\left\{ {-A_i^T \tilde {\lambda }^k-\beta A_i^T \sum\limits_{j=1}^3 {A_j \left( {\tilde {x}_j^k -x_j^k } \right)+} r\left( {\tilde {x}_i^k -x_i^k } \right)} \right\}\ge 0, i=1,2,3. $

上述$i=1,2,3$的情况取加和, 并结合(3.2)式的等价变形形式

$\frac{1}{\beta }(\tilde {\lambda }^k-\lambda ^k)+\left( {\sum\limits_{j=1}^3 {A_j \tilde {x}_j^k } -b} \right)=0,$

$\theta \left( u \right)-\theta \left( {\tilde {u}^k} \right)+\left( {w-\tilde {w}^k} \right)^T\left\{ {F\left( {\tilde {w}^k} \right)+Q\left( {\tilde {w}^k-w^k} \right)} \right\}\ge 0. $ 其中

$w=\left( {x_1 ,x_2 ,x_3 ,\lambda } \right)^T, \\ F\left( w \right)=\left( {-A_1^T \tilde {\lambda }^k,-A_2^T \tilde {\lambda }^k,-A_3^T \tilde {\lambda }^k,\sum\limits_{j=1}^3 {A_j \tilde {x}_j^k -b} } \right)^T,\\ Q=\left( {{\begin{array}{*{20}c} {rI_{n_1 } -\beta A_1^T A_1 } \hfill & {-\beta A_1^T A_2 } \hfill & {-\beta A_1^T A_3 } \hfill & 0 \hfill \\ {-\beta A_2^T A_1 } \hfill & {rI_{n_2 } -\beta A_2^T A_2 } \hfill & {-\beta A_2^T A_3 } \hfill & 0 \hfill \\ {-\beta A_3^T A_1 } \hfill & {-\beta A_3^T A_2 } \hfill & {rI_{n_3 } -\beta A_3^T A_3 } \hfill & 0 \hfill \\ 0 \hfill & 0 \hfill & 0 \hfill & {\frac{1}{\beta }I_m } \hfill \\ \end{array} }} \right).$

下证对称矩阵$Q$为半正定矩阵.

因为参数$\beta >0$, 故只需证对称矩阵

$\mathop Q\limits^\sim =\left( {{\begin{array}{*{20}c} {rI_{n_1 } -\beta A_1^T A_1 } \hfill & {-\beta A_1^T A_2 } \hfill & {-\beta A_1^T A_3 } \hfill \\ {-\beta A_2^T A_1 } \hfill & {rI_{n_2 } -\beta A_2^T A_2 } \hfill & {-\beta A_2^T A_3 } \hfill \\ {-\beta A_3^T A_1 } \hfill & {-\beta A_3^T A_2 } \hfill & {rI_{n_3 } -\beta A_3^T A_3 } \hfill \\ \end{array} }} \right)$

半正定即可.

$\mathop Q\limits^\sim =rI_n -\beta \left( {\begin{array}{l} A_1^T \\ A_2^T \\ A_3^T \\ \end{array}} \right)\left( {A_1 ,A_2 ,A_3 } \right)$, 其中$n=n_1 +n_2 +n_3 $. 记矩阵$A=\left( {A_1 ,A_2 ,A_3 } \right)$, 则矩阵$\mathop Q\limits^\sim =rI_n -\beta A^TA$.$rI-\beta A^TA\ge 0$, 由参数的取值范围知矩阵$Q$为半正定阵显然成立.

综上可知本结论得证.

注 1 一般保守估计$r\ge \beta \lambda _{\max } \left( {A^TA} \right)$.

注 2 在后面的部分, 该性质对算法的收敛性证明很有帮助.

4 收敛性证明

引理 4.1 变分不等式(2.2)的解集是凸集, 且可表示为

$\label{eq1} \Omega ^\ast =\mathop \cap \limits_{w\in \Omega } \left\{ {\tilde {w}\in \Omega :\theta \left( u \right)-\theta \left( {\tilde {u}} \right)+\left( {w-\tilde {w}} \right)^TF\left( w \right)\ge 0} \right\}.$ (4.1)

(1) ($\Rightarrow )$$\tilde {w}\in \Omega ^\ast $, 有

$ \theta \left( u \right)-\theta \left( {\tilde {u}} \right)+\left( {w-\tilde {w}} \right)^T\ge 0, \forall w\in \Omega,$

借助$F$$\Omega $上的单调性, 可以得出 $ \theta \left( u \right)-\theta \left( {\tilde {u}} \right)+\left( {w-\tilde {w}} \right)^TF\left( w \right)\ge 0, \forall w\in \Omega$, 因此 $\tilde {w}$属于(4.1)式右边的集合.

(2) ($\Leftarrow )$ 设对$\forall w\in \Omega $, 对所有的$\alpha \in \left( {0,1} \right)$, 矢量$\bar {w}=\alpha \tilde {w}+\left( {1-\alpha } \right)w\in \Omega $, 因此有

$ \theta \left( {\bar {u}} \right)-\theta \left( {\tilde {u}} \right)+\left( {\bar {w}-\tilde {w}} \right)^TF\left( {\bar {w}} \right)\ge 0.$ (4.2)

因为$\theta \left( \bullet \right)$是凸的, 故 $ \theta \left( {\bar {u}} \right)\le \alpha \theta \left( {\tilde {u}} \right)+\left( {1-\alpha } \right)\theta \left( u \right) $ 将上式代入(4.2)式, 则对所有的$\alpha \in \left( {0,1} \right)$, 有

$ \left( {1-\alpha } \right)\left[ {\theta \left( u \right)-\theta \left( {\tilde {u}} \right)} \right]+\left( {1-\alpha } \right)\left( {w-\tilde {w}} \right){ }^TF\left( {\bar {w}} \right)\ge 0,$

则易得

$\theta \left( u \right)-\theta \left( {\tilde {u}} \right)+\left( {w-\tilde {w}} \right)^TF\left( {\alpha \tilde {w}+\left( {1-\alpha } \right)w} \right)\ge 0.$

$\alpha \to 1$可得 $\theta \left( u \right)-\theta \left( {\tilde {u}} \right)+\left( {w-\tilde {w}} \right)^TF\left( {\tilde {w}} \right)\ge 0$.

因此$\tilde {w}\in \Omega ^\ast $. 综合(1), (2)知, 变分不等式的解集可表示为

$\Omega ^\ast =\mathop \cap \limits_{w\in \Omega } \left\{ {\tilde {w}\in \Omega :\theta \left( u \right)-\theta \left( {\tilde {u}} \right)+\left( {w-\tilde {w}} \right)^TF\left( w \right)\ge 0} \right\}.$

(3) 下证$\Omega ^\ast $是凸集.

$\forall w\in \Omega $, 集合 $ \left\{ {\tilde {w}\in \Omega :\theta \left( {\tilde {u}} \right)+\tilde {w}^TF\left( w \right)\le \theta \left( u \right)+w^TF\left( w \right)} \right\} $ 等价于

$\left\{ {\tilde {w}\in \Omega :\theta \left( u \right)-\theta \left( {\tilde {u}} \right)+\left( {w-\tilde {w}} \right)^TF\left( w \right)\ge 0} \right\},$

因为任意个凸集的交仍然为凸集, 故集合$\Omega ^\ast $是凸集得证.

综合(1)--(3)知本引理得证.

引理 4.2 由(3.1)式产生的序列$\left\{ {w^k} \right\}$满足 $\forall w^\ast \in W^\ast $,

$\left( {\tilde {w}^k-w^\ast } \right)^TQ\left( {w^k-\tilde {w}^k} \right)\ge 0.$ (4.3)

$w^\ast \in \Omega ^\ast \subseteq \Omega $, 其中$\Omega ^\ast $, $\Omega $分别为不等式的解集及定义域, (3.2)式中令 $w=w^\ast $, 有

$\left( {\tilde {w}^k-w^\ast } \right)^TQ\left( {w^k-\tilde {w}^k} \right)\ge \left( {\tilde {w}^k-w^\ast } \right)^TF\left( {\tilde {w}^k} \right)+\theta \left( {\tilde {u}^k} \right)-\theta \left( {u^\ast } \right)\ge 0, \forall w^\ast \in \Omega ^\ast.$

借助$F$的单调性$\left( {\tilde {w}^k-w^\ast } \right)^T\left( {F\left( {\tilde {w}^k} \right)-F\left( {w^\ast } \right)} \right)\ge 0$, 及$\tilde {w}^k\in \Omega $, $w^\ast \in \Omega ^\ast $, 有

$\left( {\tilde {w}^k-w^\ast } \right)^TF\left( {\tilde {w}^k} \right)+\theta \left( {\tilde {u}^k} \right)-\theta \left( {u^\ast } \right)\ge \left( {\tilde {w}^k-w^\ast } \right)^TF\left( {w^\ast } \right)+\theta \left( {\tilde {u}^k} \right)-\theta \left( {u^\ast } \right)\ge 0.$

因此有

$\left( {\tilde {w}^k-w^\ast } \right)^TQ\left( {w^k-w^k} \right)\ge 0, \forall w^\ast \in \Omega ^\ast .$

因此 可得到(4.3)式.

引理 4.3 由 (3.1)式产生的序列$\left\{ {w^k} \right\}$满足 $\forall w^\ast \in W^\ast $,

$\left( {w^k-w^\ast } \right)^TQ\left( {w^k-\tilde {w}^k} \right)\ge \left\| {w^k-\tilde {w}^k} \right\|_Q^2 .$ (4.4)

由引理4.2$\left( {\tilde {w}^k-w^\ast } \right)^TQ\left( {w^k-\tilde {w}^k} \right)\ge 0$

$ \left( {\tilde {w}^k-w^k+w^k-w^\ast } \right)^TQ\left( {w^k-\tilde {w}^k} \right)\ge 0, $

$\left( {w^k-w^\ast } \right)^TQ\left( {w^k-\tilde {w}^k} \right)\ge \left\| {w^k-\tilde {w}^k} \right\|_Q^2 $, 即本结论得证.

借助引理4.3, 可得到一个重要的不等式.

引理 4.4 由(3.1)式产生的序列$\left\{ {w^k} \right\}$满足

$\left\| {w^{k+1}-w^\ast } \right\|_Q^2 \le \left\| {w^k-w^\ast } \right\|_Q^2 -\alpha \left( {2-\alpha } \right)\left\| {w^k-\tilde {w}^k} \right\|_Q^2.$ (4.5)

由迭代式(3.3)得$\forall w^\ast \in W^\ast $,

$\left\| {w^k-w^\ast } \right\|_Q^2 -\left\| {w^{k+1}-w^\ast } \right\|_Q^2 =\left\| {w^k-w^\ast } \right\|_Q^2 -\left\| {\left( {w^k-w^\ast } \right)-\alpha \left( {w^k-\tilde {w}^k} \right)} \right\|_Q^2 \\ = 2\alpha \left( {w^k-w^\ast } \right)^TQ\left( {w^k-\tilde {w}^k} \right)-\alpha ^2\left\| {w^k-\tilde {w}^k} \right\|_Q^2.$

回顾(4.4)式, 则由上述不等式可得

$\left\| {w^k-w^\ast } \right\|_Q^2 -\left\| {w^{k+1}-w^\ast } \right\|_Q^2 \ge \alpha \left( {2-\alpha } \right)\left\| {w^k-\tilde {w}^k} \right\|_Q^2 , \forall w^\ast \in W^\ast,$ (4.6)

即引理论断得证.

注 3 由公式(4.6), 容易知道在迭代步(3.3)要求参数$\alpha \in \left( {0,1} \right)$, 是为了保证迭代列$\left\{ {w^k} \right\}$收敛到$W^\ast $.

现在, 借助引理4.2, 提出迭代步(3.3)对于算法收敛性是有效的.换句话说, 在$Q$ -范数意义下, 该步生成的新迭代$w^{k+1}$$w^k$更靠近集合$W^\ast $, 使得(3.1)式产生的序列$\left\{ {w^k} \right\}$的收敛性真实可靠.

定理 4.5$\left\{ {w^k} \right\}$$\left\{ {\tilde {w}^k} \right\}$是定制的邻近点交替方向线性化算法(3.1)产生的序列, 则有

(1) 序列$\left\{ {w^k} \right\}$是有界的;

(2) $\mathop {\lim }\limits_{k\to \infty } \left\| {w^k-\tilde {w}^k} \right\|^2=0$;

(3) $\left\{ {\tilde {w}^k} \right\}$的聚点是变分不等式(2.2)的解点;

(4) 聚点唯一.

(1) 由引理4.4知结论1显然成立, 因为有界点列一定有聚点. 故不妨记$w^\infty $$\left\{ {w^k} \right\}$的一个聚点.

(2) 将引理4.4中的式(4.5)式无穷迭代序列情况$k=0,1,2,\cdots $作加和, 可得到

$\sum\limits_{k=0}^\infty {\alpha \left( {2-\alpha } \right)} \left\| {{\rm w}}^k-{\rm {\bf \tilde {w}}}^k\right\|^2\le \left\| {{\rm w}}^0-{\rm {\bf w}}^\ast \right\|^2,$

$\alpha \left( {2\mbox{-}\alpha } \right)>0$, 故$ \mathop {\lim }\limits_{k\to \infty } \left\| {w}^k-{\rm {\bf \tilde {w}}}^k\right\|^2=0, $$w^\infty $也是$\left\{ {\tilde {w}^k} \right\}$的一个聚点. 又因为定义域$\Omega $为闭凸集, 则$w^\infty \in \Omega $.

(3) 由$\tilde {w}^k\in W,$

$\mathop {\lim }\limits_{k\to \infty } \left\{ {\theta \left( u \right)-\theta \left( {\tilde {u}^k} \right)+\left( {w-\tilde {w}^k} \right)^TF\left( {\tilde {w}^k} \right)} \right\}\ge 0,\forall w\in W,$

显然$\left\{ {\tilde {w}^k} \right\}$的聚点是变分不等式(2.2)的解点, 即为模型(1.1)的解点. 则 $w^\infty $为变分不等式的解点.

(4) 依引理4.4知$\left\{ {w^k} \right\}$$F$单调的且$\mathop {\lim }\limits_{k\to \infty } \left\| {w^k-\tilde {w}^k} \right\|^2=0$, 故序列$\left\{ {\tilde {w}^k} \right\}$的聚点唯一[23] , 且$\left\{ {\tilde {w}^k} \right\}$收敛到$w^\infty \in \Omega ^\ast $.

故本定理得证.

5 总结

对于三个可分离算子无交叉变量的线性约束凸优化问题, 本文在定制的邻近点交替方向法的基础上提出了线性化算法, 并证明了算法的全局收敛性.注意到在提出的算法中罚参数$\beta $及迭代中的校正步长$\alpha $均为固定值, 因此如何使用变化的参数以提高算法的效率, 是下一步研究的问题.

参考文献
[1] Tibshirani R, Saunders M, Rosset S, Zhu J, Knight K. Sparsity and smoothness via the fused lasso[J]. J. Royal Stat. Soc. Ser. Bist. Meth., 2005, 67: 91–108. DOI:10.1111/rssb.2005.67.issue-1
[2] Tao Min, Yuan XiaoMing. Recovering low-rank and sparse components of matrices from incomncomplete andnoisy observations[J]. Siam J. Contr. Opti., 2011, 21: 57–81. DOI:10.1137/100781894
[3] Fukushima M. Application of the alternating directions method of multipliers to separable con- vex programming problems[J]. Comp. Opti. Appl., 1992, 2: 93–111.
[4] Hestenes M. Multiplier and gradient methods[J]. J. Opti. The. Appl., 1969, 4: 303–320. DOI:10.1007/BF00927673
[5] Powell M. A method for nonlinear constraints in minimization problems[M]. New York: UKAEA, 1967.
[6] Gabay D, Mercier B. A dual algorithm for the solution of nonlinear variational problems via flnite element approximations[J]. Comp. Math. Appl., 1976, 2: 17–40. DOI:10.1016/0898-1221(76)90003-1
[7] Glowinski R, Marrocco A. Approximation paréléments flnis d’ ordre un et résolution parpénalis ation-dualité d’ une classe de problémes non linéaires[J]. RAIRO. Anal. Numér, 1975, 9(2): 41–76.
[8] Chen G, Teboulle M. A proximal-based decomposition method for convex minimization problems[J]. Math. Prog., 1994, 64: 81–101. DOI:10.1007/BF01582566
[9] Fukushima M. Application of the alternating directions method of multipliers to separable convex programming problems[J]. Comp. Opti. Appl., 1992, 2: 93–111.
[10] Glowinski R, Le Tallec P. Augmented lagrangian and operator-splitting methods in nonlinear mechanics[M]. Philadelphia: SIAM Stud. Appl. Math., 1989.
[11] He Bingsheng, Liao Lizhi, Han Deren, Yang Hai. A new inexact alternating direction method for monotone variational inequalities[J]. Math. Prog., 2002, 92: 103–118. DOI:10.1007/s101070100280
[12] Kontogiorgis S, Meyer R. A variable-penalty alternating directions method for convex optimization[J]. Math. Prog., 1998, 83: 29–53.
[13] Boyd S, Parikh N, Chu E, Peleato B, Eckstein J. Distributed optimization and statistical learning via the alternating direction method of multipliers[J]. Found. Trends Mach. Learn, 2010, 3: 1–122. DOI:10.1561/2200000016
[14] Chen Caihua, He Bingsheng, Yuan Xingming. Matrix completion via alternating direction method[J]. SIAM J. Numer. Anal., 2012, 32: 227–245.
[15] Esser E. Applications of lagrangian-based alternating direction methods and connections to split bregman[J]. UCLA CAM Report, 2009, 9.
[16] He Bingsheng, Yuan Xiaoming. Solving large-scale least squares covariance matrix problems by alternating direction methods[J]. SIAM J. Matrix Anal. Appl., 2011, 32: 136–152. DOI:10.1137/090768813
[17] He Bingsheng, Tao Min, Yuan Xiaoming. Alternating direction method with Gaussian back substitution for separable convex programming[J]. SIAM J. Opti., 2012, 22: 313–340. DOI:10.1137/110822347
[18] He Bingsheng, Tao Min, Yuan Xingming. An alternating direction-based contraction method for linearly constrained separable convex programming problems[J]. Opti., 2013, 62(4): 573–596. DOI:10.1080/02331934.2011.611885
[19] He Bingsheng, Tao Min, Yuan Xiaomng. A splitting method for separable convex programming[J]. IMA J. Numer. Anal., 2014.
[20] Peng Y, Ganesh A, Wright J. Robust alignment by sparse and low-rank decomposition for linearly correlated images[J]. Pattern Anal. Machine Intel., IEEE Trans., 2012, 34(11): 2233–2246. DOI:10.1109/TPAMI.2011.282
[21] Gu Guoyong, He Bingsheng, Yuan Xiaoming. Customized proximal point algorithms for linearly contrained convex minimization and saddle-point problems: a uniform approach[J]. Comp. Opti. Appl, 2011: 1–27.
[22] Blum E, Oettli W. Mathematische optimierung, grundlagen und verfahren, Ökonometrie und Un-ternehmensforschung[M]. Berlin-Heidelberg-New York: Springer-Verlag, 1975.
[23] Bauschke H H, Combettes P L. A weak-to-strong convergence principle for Fejér-monotone methods in Hilbert spaces[J]. Math Oper. Res., 2001, 26(2): 248–264. DOI:10.1287/moor.26.2.248.10558
[24] Han Deren. Inexact operator splitting methods with self-adaptive strategy for variational inequality problems[J]. J. Opti. The. Appl., 2007, 132: 227–243. DOI:10.1007/s10957-006-9060-5
[25] Han Deren, Xu Wei, Yang Hai. An operator splitting method for variational inequalities with partially unknown mappings[J]. Numer. Math., 2008, 111: 207–237. DOI:10.1007/s00211-008-0181-7
[26] He Bingsheng, Liao L Z, Wang S L. Self-adaptive operator splitting methods for monotone variational inequalities[J]. Numer. Math., 2003, 94: 715–737. DOI:10.1007/s00211-002-0408-y