数学杂志 (Journal of Mathematics), 2015, Vol. 35, Issue (5): 1275-1286
AN ITERATIVE ALGORITHM FOR THE REFLEXIVE OPTIMAL APPROXIMATION SOLUTION OF THE MATRIX EQUATION $AXB+CX^{T}D=E$
YANG Jia-wen (杨家稳)1, SUN He-ming (孙合明)2
1. Department of Basic Courses, Chuzhou Vocational and Technical College, Chuzhou 239000, China;
2. College of Science, Hohai University, Nanjing 210098, China
Abstract: In this paper, we study the optimal approximation solution of the Sylvester matrix equation $AXB+CX^{T}D=E$ over reflexive (anti-reflexive) matrices. Using the proposed conjugate direction method, we obtain the following result: whether or not the matrix equation $AXB+CX^{T}D=E$ is consistent, for an arbitrary initial reflexive (anti-reflexive) matrix $X_1$, the reflexive (anti-reflexive) optimal approximation solution can be obtained within finitely many iteration steps in the absence of round-off errors. The effectiveness of the proposed algorithm is verified by three numerical examples.
Key words: Sylvester matrix equation     Kronecker product     conjugate direction method     optimal approximation solution     reflexive matrix    
1 Introduction

We first introduce the notation used in this paper: ${{\Re }^{m\times n}}$ denotes the set of $m\times n$ real matrices, $I$ denotes the identity matrix, ${{A}^{T}}$ denotes the transpose of a matrix $A$, and $A\otimes B$ denotes the Kronecker product of $A$ and $B$. We write $\left\langle A,B\right\rangle ={\hbox{trace}}({{B}^{T}}A)$; $\left\| A \right\|$ denotes the Frobenius norm of $A$, so that ${{\left\| A \right\|}^{2}}=\left\langle A, A \right\rangle$; ${{\left\| \alpha \right\|}_{2}}$ denotes the $2$-norm of a vector $\alpha$, with $\left\| \alpha \right\|_{2}^{2}=\left\langle \alpha, \alpha \right\rangle$. A real matrix $P$ satisfying ${{P}^{2}}=I$ and ${{P}^{T}}=P$ is called a reflection matrix. A matrix $A$ is said to be reflexive (anti-reflexive) with respect to the reflection matrix $P$ if $A=PAP$ ($A=-PAP$), and we write $\Re _{r}^{n\times n}(P) = \{A\in {{\Re }^{n\times n}}\mid PAP=A\}$. If ${\hbox{vec}}({{X}^{T}})=Q(m,n){\hbox{vec}}(X)$ for every $X\in {{\Re }^{m\times n}}$, then $Q(m,n)$ is called the permutation matrix, and it satisfies

$Q(m,n)=Q{{(n,m)}^{T}}=Q{{(n,m)}^{-1}}.$

The vec (stretching) operator ${\hbox{vec}}(\cdot)$ of a matrix $A=({{a}_{ij}})\in {{\Re }^{m\times n}}$ is defined by

$ {\hbox{vec}}(A)=\left[{{a}_{11}},{{a}_{21}},\cdots ,{{a}_{m1}},{{a}_{12}},\right. \left.{{a}_{22}},\cdots ,{{a}_{m2}},\cdots , {{a}_{1n}},{{a}_{2n}},\cdots ,{{a}_{mn}}\right]^{T}, $

$\nabla f(X )$ denotes the gradient of $f(X)$ with respect to $X$, namely $\nabla f(X )=$ ${{\left( \frac{\partial f}{\partial {{x}_{11}}},\frac{\partial f}{\partial {{x}_{21}}},\cdots ,\frac{\partial f}{\partial {{x}_{mn}}} \right)}^{T}}$, and ${{\nabla }^{2}}f(X )$ denotes the Hessian matrix, where $X=({{x}_{ij}})\in {{\Re }^{m\times n}}$.
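
For readers who wish to experiment with these operators numerically, the following is a minimal NumPy sketch (an illustration of ours, not part of the paper; the helper names vec, perm_Q and is_reflexive are assumptions) of the column-stacking vec operator, the permutation matrix $Q(m,n)$ and the reflexivity test $PAP=A$.

```python
import numpy as np

def vec(A):
    """Column-stacking vec operator: stacks the columns of A into one vector."""
    return A.reshape(-1, order="F")

def perm_Q(m, n):
    """Permutation matrix Q(m, n) with vec(X^T) = Q(m, n) @ vec(X) for X in R^{m x n}."""
    Q = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # x_{ij} sits at position j*m + i in vec(X) and at i*n + j in vec(X^T)
            Q[i * n + j, j * m + i] = 1.0
    return Q

def is_reflexive(A, P, tol=1e-12):
    """Check whether A is reflexive with respect to the reflection matrix P (P = P^T, P^2 = I)."""
    return np.linalg.norm(P @ A @ P - A) <= tol * max(1.0, np.linalg.norm(A))

# sanity checks of vec(X^T) = Q(m, n) vec(X) and Q(m, n) = Q(n, m)^T on random data
m, n = 3, 4
X = np.random.default_rng(0).standard_normal((m, n))
Q = perm_Q(m, n)
assert np.allclose(vec(X.T), Q @ vec(X))
assert np.allclose(Q, perm_Q(n, m).T)
```

The two assertions at the end check the defining identity of $Q(m,n)$ and the relation $Q(m,n)=Q(n,m)^{T}$ stated above.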

It is well known that reflexive and anti-reflexive matrices with respect to a reflection matrix have wide applications in system and control theory, engineering, scientific computing and other fields [1-2]. In recent years, the Sylvester matrix equation $AXB+C{{X}^{T}}D=E$ has been one of the active topics in computational mathematics, and many results have been obtained. For the consistent case, [3] gave an expression for the solution of $AX+{{X}^{T}}C=B$ by means of the Moore-Penrose generalized inverse; [4-5] obtained the minimum-norm least squares solution of $AXB+C{{X}^{T}}D=E$ by conjugate gradient type iterative algorithms; in 2011, [6] gave the general solution of $XA+A{{X}^{T}}=0$; in 2012, [7] studied the solvability of $AXB+C{{X}^{T}}D=E$; and [8] gave the reflexive (anti-reflexive) optimal approximation solutions of the equations ${{A}_{1}}{{X}_{1}}{{B}_{1}}=C$, ${{A}_{2}}{{X}_{2}}{{B}_{2}}=D$. For the inconsistent case, Sheng [9] proposed an algorithm for the optimal approximation solution for an arbitrarily given matrix, and in 2012, [10] proposed an algorithm for the reflexive (anti-reflexive) optimal approximation solution of $AXB+CXD=E$.

To the best of our knowledge, the reflexive (anti-reflexive) optimal approximation solution of the equation $AXB+C{{X}^{T}}D=E$ has not yet been studied. Generally speaking, it is relatively easy to obtain the unconstrained solutions of this matrix equation, but much harder to obtain its constrained solutions. Following the idea of the conjugate direction method, we first construct a feasible direction and then, based on it, construct mutually conjugate descent directions, so that the iteration starting from any initial matrix converges to the global minimizer within finitely many steps.

This paper considers the following two problems.

Problem Ⅰ  Given matrices $A, C\in {{\Re }^{m\times l}}$, $B, D\in {{\Re }^{l\times n}}$, $ E\in {{\Re }^{m\times n}}$, find $X\in \Re _{r}^{l\times l}(P)$ such that

$\left\|E-AXB-C{{X}^{T}}D\right\|=\min.$ (1.1)

Problem Ⅱ  Let ${{S}_{E}}$ denote the solution set of Problem Ⅰ. Given ${{X}^{*}}\in \Re_{r}^{l\times l}(P)$, find $\hat{X}\in {{S}_{E}}$ such that

$\left\| {\hat X - {X^*}} \right\| = \mathop {\min }\limits_{X \in {S_E}} {\mkern 1mu} \left\| {X - {X^*}} \right\|.$ (1.2)

For the matrix equation $AXB+C{{X}^{T}}D=E$, if it is consistent, ${{S}_{E}}$ is its solution set; if it is inconsistent, ${{S}_{E}}$ is the set of its least squares solutions. Hence, whether or not $AXB+C{{X}^{T}}D=E$ is consistent, ${{S}_{E}}$ may be regarded as the solution set of Problem Ⅰ. Problem Ⅱ asks for the matrix in ${{S}_{E}}$ that is closest to a given matrix ${{X}^{*}}\in \Re_{r}^{l\times l}(P)$. Problem Ⅱ arises in the testing and restoration of linear systems with incomplete or modified data, where the estimate ${{X}^{*}}$ of the unknown matrix can be obtained from experimental observations or statistical information.

Clearly, the least squares solutions of Problem Ⅰ coincide with the least squares solutions of (1.3):

$\left\|E-AXB-C{{X}^{T}}D \right\|^{2}=\min,$ (1.3)

where $X\in \Re _{r}^{l\times l}\left( P \right)$.

The rest of the paper is organized as follows. Section 2 presents an iterative algorithm for Problems Ⅰ and Ⅱ and proves its convergence; Section 3 verifies the effectiveness of the algorithm by three numerical examples; Section 4 gives the conclusions.

2 An Iterative Algorithm for Problems Ⅰ and Ⅱ

The algorithm for Problems Ⅰ and Ⅱ proceeds as follows.

Step 1  Input $A\in {{\Re }^{m\times l}}$, $B\in {{\Re }^{l\times n}}$, $C\in {{\Re }^{m\times l}}$, $D\in {{\Re }^{l\times n}}$ and $E\in {{\Re }^{m\times n}}$;

Step 2  Choose an arbitrary reflexive matrix ${{X}_{1}}\in \Re _{r}^{l\times l}(P)$, where $P\in {{\Re }^{l\times l}}$ is a given generalized reflection matrix;

Step 3  Compute

${{R}_{1}}=E-A{{X}_{1}}B-C{{X}_{1}}^{T}D; \\ {{g}_{1}}={{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{R}_{1}}^{T}C+P({{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{R}_{1}}^{T}C)P; \\ {{V}_{1}}={{A}^{T}}A{{g}_{1}}B{{B}^{T}}+{{A}^{T}}C{{g}_{1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{1}}^{T}{{A}^{T}}C +D{{D}^{T}}{{g}_{1}}{{C}^{T}}C ; \\ {{d}_{1}}={{V}_{1}}+P{{V}_{1}}P;\\ k:=1;$

Step 4  If ${{g}_{1}}\text{=}\mathbf{0},$ stop; otherwise go to Step 5;

Step 5  Compute $ {\hbox{vec}}({{X}_{k+1}})={\hbox{vec}}({{X}_{k}})+\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}{\hbox{vec}}({{d}_{k}}),$ that is, ${{X}_{k+1}}={{X}_{k}}+\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}{{d}_{k}};$

${{U}_{k}}={{A}^{T}}A{{d}_{k}}B{{B}^{T}}+{{A}^{T}}C{{d}_{k}}^{T}D{{B}^{T}}+D{{B}^{T}}{{d}_{k}}^{T}{{A}^{T}}C+D{{D}^{T}}{{d}_{k}}{{C}^{T}}C;\\ {{g}_{k+1}}={{A}^{T}}{{R}_{k+1}}{{B}^{T}}+D{{R}_{k+1}}^{T}C+P({{A}^{T}}{{R}_{k+1}}{{B}^{T}}+D{{R}_{k+1}}^{T}C)P\\ ={{g}_{k}}-\frac{{\left\| {{g}_{k}} \right\|}^{2}}{{{\left\| {{d}_{k}} \right\|}^{2}}}({{U}_{k}}+P{{U}_{k}}P);\\ {{V}_{k+1}}={{A}^{T}}A{{g}_{k+1}}B{{B}^{T}}+{{A}^{T}}C{{g}_{k+1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{k+1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{k+1}}{{C}^{T}}C;\\ {{d}_{k+1}}={{V}_{k+1}}+P{{V}_{k+1}}P +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}{{d}_{k}}.$

Step 6  If ${{g}_{k+1}}=\mathbf{0}$, stop; otherwise set $k:=k+1$ and go to Step 5.
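
For concreteness, the following is a minimal NumPy sketch of Steps 1-6 (an illustration only; the function name cd_reflexive_solve, the tolerance tol and the iteration cap max_iter are our additions and not part of the original algorithm, which uses the exact stopping test $g_k=\mathbf{0}$).

```python
import numpy as np

def cd_reflexive_solve(A, B, C, D, E, P, X1=None, tol=1e-12, max_iter=None):
    """Conjugate direction iteration (Steps 1-6) for
    min || E - A X B - C X^T D ||_F over reflexive X (P X P = X)."""
    l = A.shape[1]
    X = np.zeros((l, l)) if X1 is None else X1.copy()
    if max_iter is None:
        max_iter = l * l  # finite-termination bound used in Theorem 7

    # Step 3: initial residual R_1, matrix g_1 and search direction d_1
    R = E - A @ X @ B - C @ X.T @ D
    g = A.T @ R @ B.T + D @ R.T @ C
    g = g + P @ g @ P
    V = (A.T @ A @ g @ B @ B.T + A.T @ C @ g.T @ D @ B.T
         + D @ B.T @ g.T @ A.T @ C + D @ D.T @ g @ C.T @ C)
    d = V + P @ V @ P

    for _ in range(max_iter):
        ng2 = np.linalg.norm(g, "fro") ** 2
        if np.sqrt(ng2) <= tol:            # Steps 4 and 6: stop when g_k = 0
            break
        alpha = ng2 / np.linalg.norm(d, "fro") ** 2
        X = X + alpha * d                  # Step 5: update the iterate

        U = (A.T @ A @ d @ B @ B.T + A.T @ C @ d.T @ D @ B.T
             + D @ B.T @ d.T @ A.T @ C + D @ D.T @ d @ C.T @ C)
        g_new = g - alpha * (U + P @ U @ P)

        V = (A.T @ A @ g_new @ B @ B.T + A.T @ C @ g_new.T @ D @ B.T
             + D @ B.T @ g_new.T @ A.T @ C + D @ D.T @ g_new @ C.T @ C)
        d = V + P @ V @ P + (np.linalg.norm(g_new, "fro") ** 2 / ng2) * d
        g = g_new
    return X
```

In exact arithmetic the matrix $g_k$ vanishes after at most $l^{2}$ steps (Theorem 7 below), which is why the iteration cap in this sketch defaults to $l^{2}$; in floating point a small tolerance plays the role of the exact test $g_k=\mathbf{0}$.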

To prove the convergence of the algorithm, we first establish the following theorems.

Theorem 1  Let $F(X)={{\left\| E-AXB-C{{X}^{T}}D \right\|}^{2}}$ and $R=E-AXB-C{{X}^{T}}D$, where $X\in \Re _{r}^{l\times l}(P)$. Then

(i) there exists ${{X}^{*}}\in \Re _{r}^{l\times l}(P)$ such that $\nabla F({{X}^{*}})=\mathbf{0}$, and $\nabla F(X)=-2{\hbox{vec}}({{A}^{T}}R{{B}^{T}}+D{{R}^{T}}C)$;

(ii) ${{\nabla }^{2}}F(X)$ is positive semi-definite.

Proof  (i) Let ${\hbox{vec}}({{X}^{T}})=Q(l,l){\hbox{vec}}(X)$ and ${{A}_{1}}={{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q(l,l)$. Then

$F(X)={{\left\| E-AXB-C{{X}^{T}}D \right\|}^{2}}=\left\| {\hbox{vec}}(E-AXB-C{{X}^{T}}D) \right\|_{2}^{2}\\ = \left\| {\hbox{vec}}(E)-{\hbox{vec}}(AXB+C{{X}^{T}}D) \right\|_{2}^{2}\\ =\left\| {\hbox{vec}}(E)-[{{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q(l,l)]{\hbox{vec}}(X) \right\|_{2}^{2}\\ = \left\langle {\hbox{vec}}(E)-{{A}_{1}}{\hbox{vec}}(X), {\hbox{vec}}(E)-{{A}_{1}}{\hbox{vec}}(X) \right\rangle\\ = {{[{\hbox{vec}}(E)]}^{T}}{\hbox{vec}}(E)-2{{[{\hbox{vec}}(X)]}^{T}}{{A}_{1}}^{T}{\hbox{vec}}(E)+{{[{\hbox{vec}}(X)]}^{T}}{{A}_{1}}^{T}{{A}_{1}}{\hbox{vec}}(X), $

so that

$\nabla F(X)= 2{{A}_{1}}^{T}{{A}_{1}}{\hbox{vec}}(X)-2{{A}_{1}}^{T}{\hbox{vec}}(E) =2[{{A}_{1}}^{T}{{A}_{1}}{\hbox{vec}}(X)-{{A}_{1}}^{T}{\hbox{vec}}(E)]\\ =-2{\hbox{vec}}({{A}^{T}}R{{B}^{T}}+D{{R}^{T}}C).$

Clearly, the matrix equation ${{A}_{1}}^{T}{{A}_{1}}{\hbox{vec}}(X)-{{A}_{1}}^{T}{\hbox{vec}}(E)=\mathbf{0}$ (the normal equations) is consistent, so there exists ${{X}^{*}}\in \Re _{r}^{l\times l}(P)$ such that ${{A}_{1}}^{T}{{A}_{1}}{\hbox{vec}}({{X}^{*}})-{{A}_{1}}^{T}{\hbox{vec}}(E)=\mathbf{0}$, i.e., $\nabla F({{X}^{*}})=\mathbf{0}$.

(ii) Since $\nabla F(X)=2{{A}_{1}}^{T}{{A}_{1}}{\hbox{vec}}(X)-2{{A}_{1}}^{T}{\hbox{vec}}(E)$, we have ${{\nabla }^{2}}F(X)=2{{A}_{1}}^{T}{{A}_{1}}$. For any $Y\in {{\Re }^{{{l}^{2}}}}$, ${{\left\| {{A}_{1}}Y \right\|}^{2}}\ge 0$ $\Rightarrow \left\langle {{A}_{1}}Y,{{A}_{1}}Y \right\rangle \ge 0$ $\Rightarrow {{Y}^{T}}{{A}_{1}}^{T}{{A}_{1}}Y\ge 0$, so ${{\nabla }^{2}}F(X)$ is positive semi-definite.
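
The two expressions for the gradient, the Kronecker form $2{{A}_{1}}^{T}{{A}_{1}}{\hbox{vec}}(X)-2{{A}_{1}}^{T}{\hbox{vec}}(E)$ and the matrix form $-2{\hbox{vec}}({{A}^{T}}R{{B}^{T}}+D{{R}^{T}}C)$, can be checked against each other numerically; the short sketch below (our own, on random data) does exactly that.

```python
import numpy as np

def vec(M):
    # column-stacking vec operator
    return M.reshape(-1, order="F")

def perm_Q(m, n):
    # permutation matrix with vec(X^T) = Q(m, n) @ vec(X)
    Q = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            Q[i * n + j, j * m + i] = 1.0
    return Q

rng = np.random.default_rng(1)
m, l, n = 4, 3, 5
A, C = rng.standard_normal((m, l)), rng.standard_normal((m, l))
B, D = rng.standard_normal((l, n)), rng.standard_normal((l, n))
E = rng.standard_normal((m, n))
X = rng.standard_normal((l, l))

A1 = np.kron(B.T, A) + np.kron(D.T, C) @ perm_Q(l, l)   # A_1 = B^T (x) A + (D^T (x) C) Q(l, l)
R = E - A @ X @ B - C @ X.T @ D

grad_kron = 2 * A1.T @ A1 @ vec(X) - 2 * A1.T @ vec(E)  # Kronecker form of the gradient
grad_matrix = -2 * vec(A.T @ R @ B.T + D @ R.T @ C)     # matrix form from Theorem 1(i)
assert np.allclose(grad_kron, grad_matrix)
```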

Theorem 2 [11]  Let $S\subset {{\Re }^{n}}$ be a convex set and let $f:S\subset {{\Re }^{n}}\to \Re $ be twice continuously differentiable on the interior $\operatorname{int}S$ of $S$. If ${{x}^{*}}\in \operatorname{int}S$ is a stationary point of $f(x)$ and ${{\nabla }^{2}}f(x)$ is positive semi-definite for all $x\in \operatorname{int}S$, then ${{x}^{*}}$ is a global minimizer of $f(x)$ on $\operatorname{int}S$.

Theorem 3  If $\left\{ {{X}_{i}} \right\}$, $\left\{ {{g}_{i}} \right\}$ and $\left\{ {{d}_{i}} \right\}$ are the sequences generated by the algorithm, then ${{g}_{i}},\ {{d}_{i}},\ {{X}_{i}}\in \Re _{r}^{l\times l}(P)$ $\left( i=1,2,\cdots \right)$.

Proof  We prove ${{g}_{i}},{{d}_{i}},{{X}_{i}}\in \Re _{r}^{l\times l}(P)$ $\left( i=1,2,\cdots \right)$ by induction.

$i=1$时, 所给的初始矩阵${{X}_{1}}\in \Re _{r}^{l\times l}(P)$, 显然${{g}_{1}}\in \Re _{r}^{l\times l}(P)$, ${{d}_{1}}\in \Re _{r}^{l\times l}(P)$.

Assume that the conclusion ${{g}_{k}},{{d}_{k}},{{X}_{k}}\in \Re _{r}^{l\times l}(P)$ holds for $i=k$ $\left( k\ge 2 \right)$. For $i=k+1$,

$P{{g}_{k+1}}P=P[{{g}_{k}}-\frac{{\left\| {{g}_{k}} \right\|}^{2}}{{{\left\| {{d}_{k}} \right\|}^{2}}}({{U}_{k}}+P{{U}_{k}}P)]P=P{{g}_{k}}P-\frac{{\left\| {{g}_{k}} \right\|}^{2}}{{{\left\| {{d}_{k}} \right\|}^{2}}}(P{{U}_{k}}P+{{U}_{k}})={{g}_{k+1}}, \\ P{{d}_{k+1}}P=P({{V}_{k+1}}+P{{V}_{k+1}}P +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}{{d}_{k}})P=P{{V}_{k+1}}P+{{V}_{k+1}} +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}P{{d}_{k}}P={{d}_{k+1}}, \\ P{{X}_{k+1}}P=P\left( {{X}_{k}}+\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}{{d}_{k}} \right)P=P{{X}_{k}}P+\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}P{{d}_{k}}P={{X}_{k}}+\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}{{d}_{k}}={{X}_{k+1}},$

${{g}_{i}} ,{{d}_{i}},{{X}_{i}}\in \Re _{r}^{l\times l}(P)$ $\left( i=1,2,\cdots \right)$.

Theorem 4  Let ${{X}^{*}}\in \Re _{r}^{l\times l}(P)$ be a solution of $\nabla F(X)=\mathbf{0}$, and let $\left\{ {{X}_{i}} \right\}$, $\left\{ {{g}_{i}} \right\}$ and $\left\{ {{d}_{i}} \right\}$ be the sequences generated by the algorithm. If ${{g}_{i}}\ne \mathbf{0}$, then ${{d}_{i}}\ne \mathbf{0}$ $\left( i=1,2,\cdots ,s \right)$.

Proof  We first prove by induction that $\left\langle {{d}_{i}},{{X}^{*}}-{{X}_{i}} \right\rangle ={{\left\| {{g}_{i}} \right\|}^{2}}$ $\left( i=1,2,\cdots ,s \right)$. For $i=1$,

$\left\langle {{d}_{1}},{{X}^{*}}-{{X}_{1}} \right\rangle \\ = \left\langle {{V}_{1}}+P{{V}_{1}}P,{{X}^{*}}-{{X}_{1}} \right\rangle\\ =\left\langle {{A}^{T}}A{{g}_{1}}B{{B}^{T}}+{{A}^{T}}C{{g}_{1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{1}}{{C}^{T}}C,{{X}^{*}}-{{X}_{1}} \right\rangle\\ + \left\langle {{A}^{T}}A{{g}_{1}}B{{B}^{T}}+{{A}^{T}}C{{g}_{1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{1}}{{C}^{T}}C,P( {{X}^{*}}-{{X}_{1}} )P \right\rangle \\ = 2\left\langle {{A}^{T}}A{{g}_{1}}B{{B}^{T}}+{{A}^{T}}C{{g}_{1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{1}}{{C}^{T}}C,{{X}^{*}}-{{X}_{1}} \right\rangle \\ = 2\left\langle {{A}^{T}}A{{g}_{1}}B{{B}^{T}}+D{{D}^{T}}{{g}_{1}}{{C}^{T}}C,{{X}^{*}}-{{X}_{1}} \right\rangle +2\left\langle {{A}^{T}}C{{g}_{1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{1}}^{T}{{A}^{T}}C,{{X}^{*}}-{{X}_{1}} \right\rangle \\ = 2\left\langle {{A}^{T}}A{{g}_{1}}B{{B}^{T}}+D{{D}^{T}}{{g}_{1}}{{C}^{T}}C,{{X}^{*}}-{{X}_{1}} \right\rangle+2\langle B{{D}^{T}}{{g}_{1}}{{C}^{T}}A+{{C}^{T}}A{{g}_{1}}B{{D}^{T}},{{({{X}^{*}}-{{X}_{1}})}^{T}} \rangle\\ =2\langle {{g}_{1}},{{A}^{T}}A({{X}^{*}}-{{X}_{1}})B{{B}^{T}}+D{{D}^{T}}({{X}^{*}}-{{X}_{1}}){{C}^{T}}C \rangle\\ + 2\langle {{g}_{1}},D{{B}^{T}}{{({{X}^{*}}-{{X}_{1}})}^{T}}{{A}^{T}}C+{{A}^{T}}C{{({{X}^{*}}-{{X}_{1}})}^{T}}D{{B}^{T}} \rangle \\ = 2\left\langle {{g}_{1}},{{A}^{T}}A({{X}^{*}}-{{X}_{1}})B{{B}^{T}}+D{{D}^{T}}({{X}^{*}}-{{X}_{1}}){{C}^{T}}C \right.+D{{B}^{T}}{{({{X}^{*}}-{{X}_{1}})}^{T}}{{A}^{T}}C\\ +{{A}^{T}}C{{({{X}^{*}}-{{X}_{1}})}^{T}}D{{B}^{T}} \rangle\\ = \left\langle {{g}_{1}},2({{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{R}_{1}}^{T}C) \right\rangle\\ = \left\langle {{g}_{1}},{{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{R}_{1}}^{T}C \right\rangle+\left\langle {{g}_{1}},{{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{R}_{1}}^{T}C \right\rangle\\ = \left\langle {{g}_{1}},{{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{R}_{1}}^{T}C \right\rangle+\left\langle P{{g}_{1}}P,{{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{R}_{1}}^{T}C \right\rangle \\ =\left\langle {{g}_{1}},{{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{R}_{1}}^{T}C \right\rangle+\left\langle {{g}_{1}},P({{A}^{T}}{{R}_{1}}{{B}^{T}} +D{{R}_{1}}^{T}C)P\right\rangle\\ =\left\langle {{g}_{1}},{{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{R}_{1}}^{T}C +P({{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{R}_{1}}^{T}C)P\right\rangle \\ = {{\left\| {{g}_{1}} \right\|}^{2}}.$

Assume that $\left\langle {{d}_{k}},{{X}^{*}}-{{X}_{k}} \right\rangle ={{\left\| {{g}_{k}} \right\|}^{2}}$ holds for $i=k$ $(s>k\ge 2)$. For $i=k+1$,

$\left\langle {{d}_{k+1}},{{X}^{*}}-{{X}_{k+1}} \right\rangle =\langle{{V}_{k+1}}+P{{V}_{k+1}}P +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}{{d}_{k}},{{X}^{*}}-{{X}_{k+1}} \rangle\\ =\left\langle{{V}_{k+1}},{{X}^{*}}-{{X}_{k+1}} \right\rangle+\left\langle{{V}_{k+1}},P({{X}^{*}}-{{X}_{k+1}})P \right\rangle+\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}\left\langle{{d}_{k}},{{X}^{*}}-{{X}_{k+1}} \right\rangle\\ = \left\langle 2({{A}^{T}}A{{g}_{k+1}}B{{B}^{T}}+{{A}^{T}}C{{g}_{k+1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{k+1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{k+1}}{{C}^{T}}C),{{X}^{*}}-{{X}_{k+1}} \right\rangle\\ + \frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}\left\langle {{d}_{k}},{{X}^{*}}-{{X}_{k+1}} \right\rangle\\ = \left\langle {{g}_{k+1}},2({{A}^{T}}{{R}_{k+1}}{{B}^{T}}+D{{R}_{k+1}}^{T}C) \right\rangle +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}\left\langle {{d}_{k}},{{X}^{*}}-{{X}_{k+1}} \right\rangle \\ =\left\langle {{g}_{k+1}},{{A}^{T}}{{R}_{k+1}}{{B}^{T}}+D{{R}_{k+1}}^{T}C+P({{A}^{T}}{{R}_{k+1}}{{B}^{T}}+D{{R}_{k+1}}^{T}C)P \right\rangle\\ +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}\left\langle {{d}_{k}},{{X}^{*}}-{{X}_{k+1}} \right\rangle \\ ={{\left\| {{g}_{k+1}} \right\|}^{2}}+\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}\left\langle {{d}_{k}},{{X}^{*}}-{{X}_{k+1}} \right\rangle \\ = {{\left\| {{g}_{k+1}} \right\|}^{2}}+\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}\left\langle {{d}_{k}},{{X}^{*}}-{{X}_{k}}-\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}{{d}_{k}} \right\rangle \\ = {{\left\| {{g}_{k+1}} \right\|}^{2}}+\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}\left[ \left\langle {{d}_{k}},{{X}^{*}}-{{X}_{k}} \right\rangle -\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}\left\langle {{d}_{k}},{{d}_{k}} \right\rangle \right]\\ ={{\left\| {{g}_{k+1}} \right\|}^{2}}+\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}\left[ {{\left\| {{g}_{k}} \right\|}^{2}}-\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}{{\left\| {{d}_{k}} \right\|}^{2}} \right] ={{\left\| {{g}_{k+1}} \right\|}^{2}}.$

$\left\langle {{d}_{i}},{{X}^{*}}-{{X}_{i}} \right\rangle ={{\left\| {{g}_{i}} \right\|}^{2}}$成立, 所以当${{g}_{i}}\ne \mathbf{0}$时, ${{P}_{i}}\ne \mathbf{0}$ $(i=1,2,\cdots ,s)$.

Theorem 5  Let ${{g}_{1}}$ and ${{d}_{1}}$ be generated by the algorithm, and write ${\hbox{vec}}({{g}_{1}}^{T})=Q(l,l){\hbox{vec}}({{g}_{1}})=Q{\hbox{vec}}({{g}_{1}})$. If ${{g}_{1}}\ne \mathbf{0}$, then $\nabla F{{({{X}_{1}})}^{T}}{\rm vec}({{d}_{1}})<0$.

Proof

$\nabla F{{({{X}_{1}})}^{T}}{\hbox{vec}}({{d}_{1}})=\left\langle-2{\hbox{vec}}({{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{{R}_{1}}^{T}}C),{\hbox{vec}}({{d}_{1}}) \right\rangle\\ = -\left\langle 2{\hbox{vec}}({{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{{R}_{1}}^{T}}C),{\hbox{vec}}(P{{d}_{1}}P) \right\rangle\\ = -\left\langle 2{\hbox{vec}}({{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{{R}_{1}}^{T}}C),(P\otimes P){\hbox{vec}}({{d}_{1}}) \right\rangle \\ = -\left\langle {\hbox{vec}}[({{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{{R}_{1}}^{T}}C)+P({{A}^{T}}{{R}_{1}}{{B}^{T}}+D{{{R}_{1}}^{T}}C)P],{\hbox{vec}}({{d}_{1}}) \right\rangle \\ =-\left\langle {\hbox{vec}}({{g}_{1}}),\left( I+P\otimes P \right){\hbox{vec}} ({{A}^{T}}A{{g}_{1}}B{{B}^{T}}+{{A}^{T}}C{{g}_{1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{1}}{{C}^{T}}C)\right\rangle \\ =-{{\left[ {\hbox{vec}}({{g}_{1}}) \right]}^{T}}\left( I+P\otimes P \right){{\left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]}^{T}}\left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]{\hbox{vec}}({{g}_{1}})\\ =-{{\left\{ \left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]\left( I+P\otimes P \right){\hbox{vec}}({{g}_{1}}) \right\}}^{T}}\left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]{\hbox{vec}}({{g}_{1}})\\ =-{{\left\{ \left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]2{\hbox{vec}}({{g}_{1}}) \right\}}^{T}}\left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]{\hbox{vec}}({{g}_{1}})\\ =-2\left\| \left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]{\hbox{vec}}({{g}_{1}}) \right\|_{2}^{2}\le 0.$

Suppose that $\left\| \left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]{\hbox{vec}}({{g}_{1}}) \right\|_{2}^{2}=0$. Then $\left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]{\hbox{vec}}({{g}_{1}})=\mathbf{0}$, and consequently

${\hbox{vec}}({d}_{1})= \left( I+P\otimes P \right){{\left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]}^{T}}\left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]{\hbox{vec}}({{g}_{1}})=\mathbf{0}.$

Since ${{g}_{1}}\ne \mathbf{0}$, Theorem 4 gives ${{d}_{1}}\ne \mathbf{0}$, which contradicts ${\hbox{vec}}({d}_{1})=\mathbf{0}$. Therefore

$-\left\| \left[ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q \right]{\hbox{vec}}({{g}_{1}}) \right\|_{2}^{2}<0.$

$\nabla F{{({{X}_{1}})}^{T}}{\hbox{vec}}({{d}_{1}})<0. $

Theorem 5 shows that the search direction ${\hbox{vec}}({d}_{1})$ in the algorithm is a feasible descent direction. We next show that all the search directions are mutually conjugate.

Theorem 6  If $\left\{ {{g}_{i}} \right\}$ and $\left\{ {{d}_{i}} \right\}$ are the sequences generated by the algorithm and ${{d}_{i}}\ne \mathbf{0}$ ($i=1,2,\cdots ,l$), then

$\left\langle {{g}_{i}},{{g}_{j}} \right\rangle =0 ; \left\langle {{d}_{i}},{{d}_{j}} \right\rangle =0\left( i,j=1,2,\cdots ,l,i\ne j \right).$ (2.1)

Proof  Since $\left\langle A,B \right\rangle =\left\langle B,A \right\rangle $ for $A,B\in {{\Re }^{m\times n}}$, it suffices to prove the conclusion for $1\le i<j$. We prove it by induction in two steps.

Step 1  Prove $\left\langle {{g}_{i}},{{g}_{i+1}} \right\rangle =0$ and $\left\langle {{d}_{i}},{{d}_{i+1}} \right\rangle =0$ $(i=1,2,\cdots )$. For $i=1$,

$\left\langle {{g}_{1}},{{g}_{2}} \right\rangle = \langle {{g}_{1}},{{g}_{1}}-\frac{{\left\| {{g}_{1}} \right\|}^{2}}{{{\left\| {{d}_{1}} \right\|}^{2}}}({U}_{1}+P{{U}_{1}}P)\rangle = \left\langle {{g}_{1}},{{g}_{1}}\right\rangle-\frac{{\left\| {{g}_{1}} \right\|}^{2}}{{{\left\| {{d}_{1}} \right\|}^{2}}}\left\langle {{g}_{1}},{U}_{1}+P{{U}_{1}}P\right\rangle\\ = {{\left\| {{g}_{1}} \right\|}^{2}}-\frac{{\left\| {{g}_{1}} \right\|}^{2}}{{{\left\| {{d}_{1}} \right\|}^{2}}}[\left\langle {{g}_{1}},{{U}_{1}}\right\rangle+\left\langle P{{g}_{1}}P,{{U}_{1}}\right\rangle] = {{\left\| {{g}_{1}} \right\|}^{2}}-\frac{{\left\| {{g}_{1}} \right\|}^{2}}{{{\left\| {{d}_{1}} \right\|}^{2}}}\left\langle {{g}_{1}},2{{U}_{1}}\right\rangle\\ ={{\left\| {{g}_{1}} \right\|}^{2}}-\frac{2{{\left\| {{g}_{1}} \right\|}^{2}}}{{{\left\| {{d}_{1}} \right\|}^{2}}}\left\langle {{g}_{1}},{{A}^{T}}A{{d}_{1}}B{{B}^{T}}+{{A}^{T}}C{{d}_{1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{d}_{1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{d}_{1}}{{C}^{T}}C \right\rangle \\ = {{\left\| {{g}_{1}} \right\|}^{2}}-\frac{2{{\left\| {{g}_{1}} \right\|}^{2}}}{{{\left\| {{d}_{1}} \right\|}^{2}}}\left\langle {{A}^{T}}A{{g}_{1}}B{{B}^{T}}+{{A}^{T}}C{{g}_{1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{1}}{{C}^{T}}C,{{d}_{1}} \right\rangle \\ = {{\left\| {{g}_{1}} \right\|}^{2}}-\frac{{{\left\| {{g}_{1}} \right\|}^{2}}}{{{\left\| {{d}_{1}} \right\|}^{2}}}\left\langle {{V}_{1}},{{d}_{1}}+P{{d}_{1}}P \right\rangle = {{\left\| {{g}_{1}} \right\|}^{2}}-\frac{{{\left\| {{g}_{1}} \right\|}^{2}}}{{{\left\| {{d}_{1}} \right\|}^{2}}}\left\langle {{V}_{1}}+P{{V}_{1}}P,{{d}_{1}} \right\rangle \\ = {{\left\| {{g}_{1}} \right\|}^{2}}-\frac{{{\left\| {{g}_{1}} \right\|}^{2}}}{{{\left\| {{d}_{1}} \right\|}^{2}}}\left\langle {{d}_{1}},{{d}_{1}} \right\rangle =0,\\ \left\langle {{d}_{1}},{{d}_{2}} \right\rangle = \langle {{d}_{1}},{{V}_{2}} + P{{V}_{2}}P+\frac{{{\left\| {{g}_{2}} \right\|}^{2}}}{{{\left\| {{g}_{1}} \right\|}^{2}}}{{d}_{1}} \rangle = \left\langle {{d}_{1}},2{{V}_{2}}\right\rangle+ \frac{{{\left\| {{g}_{2}} \right\|}^{2}}}{{{\left\| {{g}_{1}} \right\|}^{2}}}\left\langle {{d}_{1}},{{d}_{1}} \right\rangle\\ = \left\langle {{d}_{1}},2({{A}^{T}}A{{g}_{2}}B{{B}^{T}}+{{A}^{T}}C{{g}_{2}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{2}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{2}}{{C}^{T}}C) \right\rangle +\frac{{{\left\| {{g}_{2}} \right\|}^{2}}}{{{\left\| {{g}_{1}} \right\|}^{2}}}\left\langle {{d}_{1}},{{d}_{1}} \right\rangle \\ = \left\langle 2({{A}^{T}}A{{d}_{1}}B{{B}^{T}}+{{A}^{T}}C{{d}_{1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{d}_{1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{d}_{1}}{{C}^{T}}C),{{g}_{2}} \right\rangle +\frac{{{\left\| {{g}_{2}} \right\|}^{2}}}{{{\left\| {{g}_{1}} \right\|}^{2}}}\left\langle {{d}_{1}},{{d}_{1}} \right\rangle \\ = \left\langle 2{{U}_{1}},{{g}_{2}} \right\rangle +\frac{{{\left\| {{g}_{2}} \right\|}^{2}}}{{{\left\| {{g}_{1}} \right\|}^{2}}}\left\langle {{d}_{1}},{{d}_{1}} \right\rangle = \left\langle {U}_{1}+P{{U}_{1}}P,{{g}_{2}} \right\rangle +\frac{{{\left\| {{g}_{2}} \right\|}^{2}}}{{{\left\| {{g}_{1}} \right\|}^{2}}}\left\langle {{d}_{1}},{{d}_{1}} \right\rangle \\ = \left\langle \frac{{{\left\| {{d}_{1}} \right\|}^{2}}}{{{\left\| {{g}_{1}} \right\|}^{2}}}\left( {{g}_{1}}-{{g}_{2}} \right),{{g}_{2}} \right\rangle +\frac{{{\left\| {{g}_{2}} \right\|}^{2}}}{{{\left\| {{g}_{1}} \right\|}^{2}}}{{\left\| {{d}_{1}} \right\|}^{2}}\\ = \frac{{{\left\| {{d}_{1}} \right\|}^{2}}}{{{\left\| {{g}_{1}} \right\|}^{2}}}\left\langle {{g}_{1}}-{{g}_{2}},{{g}_{2}} \right\rangle +\frac{{{\left\| {{g}_{2}} \right\|}^{2}}}{{{\left\| {{g}_{1}} \right\|}^{2}}}{{\left\| {{d}_{1}} \right\|}^{2}}=0.$

Assume that the conclusion holds for $i\le k-1$ $(s>k\ge 2)$. For $i=k$,

$\left\langle {{g}_{k}},{{g}_{k+1}} \right\rangle = \left\langle {{g}_{k}},{{g}_{k}}-\frac{{\left\| {{g}_{k}} \right\|}^{2}}{{{\left\| {{d}_{k}} \right\|}^{2}}}({U}_{k}+P{{U}_{k}}P) \right\rangle ={{\left\| {{g}_{k}} \right\|}^{2}}-\frac{{\left\| {{g}_{k}} \right\|}^{2}}{{{\left\| {{d}_{k}} \right\|}^{2}}}\left\langle {g}_{k},2{U}_{k} \right\rangle \\ = {{\left\| {{g}_{k}} \right\|}^{2}}-\frac{{\left\| {{g}_{k}} \right\|}^{2}}{{{\left\| {{d}_{k}} \right\|}^{2}}}\left\langle {{A}^{T}}A{{g}_{k}}B{{B}^{T}}+D{{B}^{T}}{{g}_{k}}^{T}{{A}^{T}}C+{{A}^{T}}C{{g}_{k}}^{T}D{{B}^{T}}+D{{D}^{T}}{{g}_{k}}{{C}^{T}}C,2{{d}_{k}} \right\rangle\\ ={{\left\| {{g}_{k}} \right\|}^{2}}-\frac{{\left\| {{g}_{k}} \right\|}^{2}}{{{\left\| {{d}_{k}} \right\|}^{2}}}\left\langle {{V}_{k}},2{{d}_{k}} \right\rangle ={{\left\| {{g}_{k}} \right\|}^{2}}-\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}\left\langle {{V}_{k}},{{d}_{k}}+P{{d}_{k}}P \right\rangle \\ = {{\left\| {{g}_{k}} \right\|}^{2}}-\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}\left\langle {{V}_{k}}+P{{V}_{k}}P ,{{d}_{k}}\right\rangle \\ = {{\left\| {{g}_{k}} \right\|}^{2}}-\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}\langle {{d}_{k}}-\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{g}_{k-1}} \right\|}^{2}}}{{d}_{k-1}},{{d}_{k}} \rangle\\ = {{\left\| {{g}_{k}} \right\|}^{2}}-\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{d}_{k}} \right\|}^{2}}}( {{\left\| {{d}_{k}} \right\|}^{2}}-\frac{{{\left\| {{g}_{k}} \right\|}^{2}}}{{{\left\| {{g}_{k-1}} \right\|}^{2}}}\left\langle {{d}_{k-1}},{{d}_{k}} \right\rangle )=0, \\ \langle {{d}_{k}},{{d}_{k+1}} \rangle =\langle {{d}_{k}},{V}_{k+1}+P{{V}_{k+1}}P+\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}{{d}_{k}} \rangle =\langle {{d}_{k}},2{V}_{k+1}\rangle+\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}{{\left\| {{d}_{k}} \right\|}^{2}}\\ =\left\langle {{d}_{k}},2({{A}^{T}}A{{g}_{k+1}}B{{B}^{T}}+{{A}^{T}}C{{g}_{k+1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{k+1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{k+1}}{{C}^{T}}C) \right\rangle\\ +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}{{\left\| {{d}_{k}} \right\|}^{2}}\\ = \left\langle 2({{A}^{T}}A{{d}_{k}}B{{B}^{T}}+{{A}^{T}}C{{d}_{k}}^{T}D{{B}^{T}}+D{{B}^{T}}{{d}_{k}}^{T}{{A}^{T}}C+D{{D}^{T}}{{d}_{k}}{{C}^{T}}C),{{g}_{k+1}} \right\rangle\\ +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}{{\left\| {{d}_{k}} \right\|}^{2}}= \left\langle 2{{U}_{k}},{g}_{k+1} \right\rangle +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}{{\left\| {{d}_{k}} \right\|}^{2}}\\ = \left\langle {{U}_{k}}+P{{U}_{k}}P,{g}_{k+1} \right\rangle +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}{{\left\| {{d}_{k}} \right\|}^{2}}\\ = \frac{{{\left\| {{d}_{k}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}\left\langle {{g}_{k}}-{{g}_{k+1}},{{g}_{k+1}} \right\rangle +\frac{{{\left\| {{g}_{k+1}} \right\|}^{2}}}{{{\left\| {{g}_{k}} \right\|}^{2}}}{{\left\| {{d}_{k}} \right\|}^{2}}=0,$

$\left\langle {{g}_{i}},{{g}_{i+1}} \right\rangle =0$$\left\langle {{d}_{i}},{{d}_{i+1}} \right\rangle =0$ $(i=1,2,\cdots )$成立.

Step 2  Based on Step 1, assume that $\left\langle {{g}_{i}},{{g}_{i+k}} \right\rangle =0$ and $\left\langle {{d}_{i}},{{d}_{i+k}} \right\rangle =0$ hold for all $i$ and $k$.

We now show that $\left\langle {{g}_{i}},{{g}_{i+k+1}} \right\rangle =0$ and $\left\langle {{d}_{i}},{{d}_{i+k+1}} \right\rangle =0$:

$\left\langle {{g}_{i}},{{g}_{i+k+1}} \right\rangle = \langle {{g}_{i}},{{g}_{i+k}}-\frac{{\left\| {{g}_{i+k}} \right\|}^{2}}{{{\left\| {{d}_{i+k}} \right\|}^{2}}}({{U}_{i+k}}+P{{U}_{i+k}}P) \rangle =-\frac{{\left\| {{g}_{i+k}} \right\|}^{2}}{{{\left\| {{d}_{i+k}} \right\|}^{2}}}\left\langle {{g}_{i}},2{{U}_{i+k}}\right\rangle\\ = -\frac{2{{\left\| {{g}_{i+k}} \right\|}^{2}}}{{{\left\| {{d}_{i+k}} \right\|}^{2}}}\left\langle {{g}_{i}},{{A}^{T}}A{{d}_{i+k}}B{{B}^{T}}+{{A}^{T}}C{{d}_{i+k}}^{T}D{{B}^{T}}+D{{B}^{T}}{{d}_{i+k}}^{T}{{A}^{T}}C+D{{D}^{T}}{{d}_{i+k}}{{C}^{T}}C \right\rangle \\ = -\frac{2{{\left\| {{g}_{i+k}} \right\|}^{2}}}{{{\left\| {{d}_{i+k}} \right\|}^{2}}}\left\langle {{A}^{T}}A{{g}_{i}}B{{B}^{T}}+{{A}^{T}}C{{g}_{i}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{i}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{i}}{{C}^{T}}C,{{d}_{i+k}} \right\rangle \\ = -\frac{2{{\left\| {{g}_{i+k}} \right\|}^{2}}}{{{\left\| {{d}_{i+k}} \right\|}^{2}}}\left\langle {{V}_{i}},{{d}_{i+k}} \right\rangle =-\frac{{{\left\| {{g}_{i+k}} \right\|}^{2}}}{{{\left\| {{d}_{i+k}} \right\|}^{2}}}\left\langle {{V}_{i}},{{d}_{i+k}}+P{{d}_{i+k}}P \right\rangle\\ = -\frac{{{\left\| {{g}_{i+k}} \right\|}^{2}}}{{{\left\| {{d}_{i+k}} \right\|}^{2}}}\left\langle {{V}_{i}}+P{{V}_{i}}P,{{d}_{i+k}} \right\rangle = -\frac{{{\left\| {{g}_{i+k}} \right\|}^{2}}}{{{\left\| {{d}_{i+k}} \right\|}^{2}}}\langle {{d}_{i}}-\frac{{{\left\| {{g}_{i}} \right\|}^{2}}}{{{\left\| {{g}_{i-1}} \right\|}^{2}}}{{d}_{i-1}},{{d}_{i+k}} \rangle =0,\\ \langle {{d}_{i}},{{d}_{i+k+1}} \rangle = \langle {{d}_{i}},{{V}_{i+k+1}}+P{{V}_{i+k+1}}P +\frac{{{\left\| {{g}_{i+k+1}} \right\|}^{2}}}{{{\left\| {{g}_{i+k}} \right\|}^{2}}}{{d}_{i+k}} \rangle \\ = \left\langle {{d}_{i}},2{{V}_{i+k+1}} \right\rangle +\frac{{{\left\| {{g}_{i+k+1}} \right\|}^{2}}}{{{\left\| {{g}_{i+k}} \right\|}^{2}}}\left\langle {{d}_{i}},{{d}_{i+k}} \right\rangle \\ = \left\langle {{d}_{i}},2({{A}^{T}}A{{g}_{i+k+1}}B{{B}^{T}}+{{A}^{T}}C{{g}_{i+k+1}}^{T}D{{B}^{T}}+D{{B}^{T}}{{g}_{i+k+1}}^{T}{{A}^{T}}C+D{{D}^{T}}{{g}_{i+k+1}}{{C}^{T}}C) \right\rangle \\ = \left\langle 2({{A}^{T}}A{{d}_{i}}B{{B}^{T}}+{{A}^{T}}C{{d}_{i}}^{T}D{{B}^{T}}+D{{B}^{T}}{{d}_{i}}^{T}{{A}^{T}}C+D{{D}^{T}}{{d}_{i}}{{C}^{T}}C) ,{{g}_{i+k+1}}\right\rangle \\ = \left\langle 2{{U}_{i}},{{g}_{i+k+1}}\right\rangle =\left\langle {{U}_{i}}+P{{U}_{i}}P,{{g}_{i+k+1}}\right\rangle = \frac{{{\left\| {{d}_{i}} \right\|}^{2}}}{{{\left\| {{g}_{i}} \right\|}^{2}}}\left\langle {{g}_{i}}-{{g}_{i+1}},{{g}_{i+k+1}} \right\rangle=0.$

By Steps 1 and 2, $\left\langle {{g}_{i}},{{g}_{j}} \right\rangle =0$ and $\left\langle {{d}_{i}},{{d}_{j}} \right\rangle =0$ $\left( i,j=1,2,\cdots ,i\ne j \right)$.

Theorem 7  For any ${{X}_{1}}\in \Re _{r}^{l\times l}\left( P \right)$, where $P\in {{\Re }^{l\times l}}$ is a generalized reflection matrix, the algorithm yields a reflexive solution of Problem Ⅰ within finitely many iteration steps.

Proof  Suppose that ${{g}_{i}}\ne \mathbf{0}$ $\left( i=1,2,\cdots ,{{l}^{2}} \right)$. By Theorem 4, ${{d}_{i}}\ne \mathbf{0}$ $\left( i=1,2,\cdots ,{{l}^{2}} \right)$, so the algorithm can compute ${{X}_{{{l}^{2}}+1}}$ and ${{g}_{{{l}^{2}}+1}}$. By Theorem 6, $\left\langle {{g}_{i}},{{g}_{{{l}^{2}}+1}} \right\rangle =0$ $\left( i=1,2,\cdots ,{{l}^{2}} \right)$ and $\left\langle {{g}_{i}},{{g}_{j}} \right\rangle =0$ $\left( i,j=1,2,\cdots ,{{l}^{2}},i\ne j \right)$. Since $\left\{ {{g}_{1}},{{g}_{2}},\cdots ,{{g}_{{{l}^{2}}}} \right\}$ is an orthogonal basis of the matrix space $\Re _{r}^{{{l}^{2}}}$, it follows that ${{g}_{{{l}^{2}}+1}}=\mathbf{0}$, and hence $\nabla F({{X}_{{{l}^{2}}+1}} )=\mathbf{0}$. By Theorems 1-3, ${{X}_{{{l}^{2}}+1}}$ is a reflexive solution of Problem Ⅰ.

Theorem 8 [12]  If the minimum residual problem ${{\left\| Mx-b \right\|}_{2}}=\text{min}$ has a solution ${{x}^{*}}\in R\left( {{M}^{T}} \right)$, then ${{x}^{*}}$ is the minimum-norm least squares solution of this problem.

Theorem 9  In the proposed algorithm, if the initial matrix is taken as ${{X}_{1}}=\frac{1}{2}({{A}^{T}}N{{B}^{T}}+D{{N}^{T}}C)+\frac{1}{2}P({{A}^{T}}N{{B}^{T}}+D{{N}^{T}}C)P$ for some $N\in {{\Re }^{m\times n}}$, and in particular if ${{X}_{1}}=\mathbf{0}$, then the algorithm yields the minimum-norm least squares reflexive solution of Problem Ⅰ within finitely many iteration steps.

Proof  If the initial matrix is ${{X}_{1}}=\frac{1}{2}({{A}^{T}}N{{B}^{T}}+D{{N}^{T}}C)+\frac{1}{2}P({{A}^{T}}N{{B}^{T}}+D{{N}^{T}}C)P$, then by Theorem 7 the algorithm yields a reflexive solution ${{X}^{*}}$ of Problem Ⅰ within finitely many iteration steps, and ${{X}^{*}}$ can be expressed as

${{X}^{*}}=\frac{1}{2}({{A}^{T}}\tilde{N}{{B}^{T}}+D{{\tilde{N}}^{T}}C)+\frac{1}{2}P({{A}^{T}}\tilde{N}{{B}^{T}}+D{{\tilde{N}}^{T}}C)P.$

${\hbox{vec}}({X}^{T})=Q{\hbox{vec}}({X})$, 其中$X\in {{\Re }^{l\times l}}$.

We now show that ${{X}^{*}}$ is the minimum-norm least squares reflexive solution of Problem Ⅰ:

$\mathop {\min }\limits_{X \in \Re _r^{l \times l}(P)} \,\left\| E-\left( AXB+C{{X}^{T}}D \right) \right\|\\ = \mathop {\min }\limits_{X \in \Re _r^{l \times l}(P)} \,\left\| E-\frac{1}{2}\left( AXB+C{{X}^{T}}D \right)-\frac{1}{2}\left( APXPB+CP{{X}^{T}}PD \right) \right\|\\ =\mathop {\min }\limits_{X \in \Re _r^{l \times l}(P)} \,{{\left\| {\hbox{vec}}(E)-\frac{1}{2}vec\left[ \left( AXB+C{{X}^{T}}D \right)+\left( APXPB+CP{{X}^{T}}PD \right) \right] \right\|}_{2}}\\ = \mathop {\min }\limits_{X \in \Re _r^{l \times l}(P)} \,\left\| {\hbox{vec}}(E)-\frac{1}{2}\left\{ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q+({{B}^{T}}P)\otimes (AP)+[({{D}^{T}}P)\otimes (CP)]Q \right\}\right. \\ \left. {\hbox{vec}}(X) \right\|_{2}.$

$\tilde{N}\in {{\Re }^{m\times n}}$, 则

${\hbox{vec}}( {{X}^{*}} )={\hbox{vec}}[\frac{1}{2}({{A}^{T}}\tilde{N}{{B}^{T}}+D{{{\tilde{N}}}^{T}}C)+\frac{1}{2}P({{A}^{T}}\tilde{N}{{B}^{T}}+D{{{\tilde{N}}}^{T}}C)P]\\ = \frac{1}{2}{{\left\{ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q+({{B}^{T}}P)\otimes (AP)+[({{D}^{T}}P)\otimes (CP)]Q \right\}}^{T}}{\hbox{vec}}( {\tilde{N}})\\ \in R ( \frac{1}{2}{{\left\{ {{B}^{T}}\otimes A+({{D}^{T}}\otimes C)Q+({{B}^{T}}P)\otimes (AP)+[({{D}^{T}}P)\otimes (CP)]Q \right\}}^{T}} ).$

Hence, when the initial matrix is ${{X}_{1}}=\frac{1}{2}({{A}^{T}}N{{B}^{T}}+D{{N}^{T}}C)+\frac{1}{2}P({{A}^{T}}N{{B}^{T}}+D{{N}^{T}}C)P$, and in particular when ${{X}_{1}}=\mathbf{0}$, Theorem 8 shows that the reflexive solution ${{X}^{*}}$ obtained by the algorithm is the reflexive minimum-norm least squares solution.
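
In code, the special initial matrix of Theorem 9 is easy to form; the helper below is our own sketch (the name make_min_norm_X1 is not from the paper), and taking $N=\mathbf{0}$ recovers the choice ${{X}_{1}}=\mathbf{0}$ used in Section 3.

```python
import numpy as np

def make_min_norm_X1(A, B, C, D, P, N=None):
    """Initial matrix of Theorem 9:
    X1 = (A^T N B^T + D N^T C)/2 + P(A^T N B^T + D N^T C)P/2.
    The default N = 0 gives X1 = 0, the choice used in the numerical examples."""
    if N is None:
        N = np.zeros((A.shape[0], B.shape[1]))
    W = A.T @ N @ B.T + D @ N.T @ C
    return 0.5 * (W + P @ W @ P)
```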

If the solution set ${{S}_{E}}$ of Problem Ⅰ is nonempty and $\bar{X}\in \Re _{r}^{l\times l}(P)$ is the given approximation matrix, then

$\mathop {\min }\limits_{X \in \Re _r^{l \times l}(P)} \,\left\| E-AXB-C{{X}^{T}}D \right\|\\ = \mathop {\min }\limits_{X \in \Re _r^{l \times l}(P)} \,\left\|( E-A\bar{X}B-C{{{\bar{X}}}^{T}}D )-A( X-\bar{X} )B-C{{( X-\bar{X} )}^{T}}D \right\|.$

$\tilde{X}=X-\bar{X}$$\tilde{E}=E-A\bar{X}B-C{{\bar{X}}^{T}}D$, 那么最佳逼近问题Ⅱ就等价于先求 (2.2) 式的自反极小范数最小二乘解.

$\mathop {\min }\limits_{X \in \Re _r^{l \times l}(P)} \,\left\| \tilde{E}-A\tilde{X}B-C{{{\tilde{X}}}^{T}}D \right\|.$ (2.2)

The algorithm yields the reflexive minimum-norm least squares solution ${{\tilde{X}}^{*}}$ of (2.2), and the reflexive optimal approximation solution of Problems Ⅰ and Ⅱ is then $\hat{X}={{\tilde{X}}^{*}}+\bar{X}$.
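
Putting the pieces together, the whole Problem Ⅱ workflow can be sketched as follows (again our own illustration: it reuses the cd_reflexive_solve and make_min_norm_X1 helpers sketched above, and the name solve_problem_2 is not from the paper).

```python
def solve_problem_2(A, B, C, D, E, P, X_bar, tol=1e-12):
    """Problem II workflow: shift by the given matrix X_bar, solve the shifted
    least squares problem (2.2) with the zero initial matrix, then shift back."""
    E_tilde = E - A @ X_bar @ B - C @ X_bar.T @ D
    X1 = make_min_norm_X1(A, B, C, D, P)            # equals the zero matrix
    X_tilde = cd_reflexive_solve(A, B, C, D, E_tilde, P, X1=X1, tol=tol)
    return X_tilde + X_bar                          # reflexive optimal approximation solution
```

Applied, for instance, to the data of Example 3 below (with $\bar X$ the all-tens matrix), this carries out exactly the shift-solve-shift procedure described above.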

3 Numerical Examples

In this section, three numerical examples are used to verify the feasibility of the above algorithm. In the first example the matrix equation $AXB+C{{X}^{T}}D=E$ has a reflexive solution and the approximation matrix is the zero matrix; in the second example the equation is inconsistent and the approximation matrix is the zero matrix; in the last example the equation has a reflexive solution and the approximation matrix is nonzero. The computations were carried out in Matlab 2007R with the initial reflexive matrix ${{X}_{1}}=\mathbf{0}$.

Example 1  Consider the following minimum residual problem:

$\left\| E-AXB-C{{X}^{T}}D \right\|=\min,$ (3.1)

where

$A=\left( \begin{matrix} 2&1&6&3&-4 \\ 5&4&-3&3&-6 \\ -1&4&8&-7&2 \\ 5&-2&-6&9&4 \end{matrix} \right), ~ ~ B=\left( \begin{matrix} 5&2&-6&-4&5 \\ -7&8&1&3&-5 \\ 2&-9&8&-1&-2 \\ 2&4&-3&-7&11 \\ 4&6&-2&-12&-4 \end{matrix} \right), \\ C=\left( \begin{matrix} 4&-2&9&-7&11 \\ -6&7&5&8&-3 \\ -13&2&4&-5&1 \\ 8&-6&2&6&-2 \end{matrix} \right), D=\left( \begin{matrix} -3&-2&7&3&-1 \\ -6&1&-2&5&-2 \\ 4&3&1&-3&9 \\ -5&-3&2&4&6 \\ 2&3&-6&11&-11 \end{matrix} \right), \bar{X}=\left( \begin{matrix} 0&0&0&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \\ \end{matrix} \right), \\ E=\left( \begin{matrix} \text{-2064}&\text{-1543 }&\text{ 1510 }&\text{838}&\text{-195} \\ \text{261}&\text{-271}&\text{227}&\text{-742}&\text{304} \\ \text{-119}&\text{-524}&\text{720}&\text{-1683}&\text{4651} \\ \text{-563}&\text{1059}&\text{-773}&\text{796}&\text{-3000} \end{matrix} \right), P=\left( \begin{matrix} 0&0&-1&0&0 \\ 0&-1&0&0&0 \\ -1&0&0&0&0 \\ 0&0&0&0&-1 \\ 0&0&0&-1&0 \\ \end{matrix} \right).$

One can verify that the following $X$ is a reflexive solution of the matrix equation $AXB+C{{X}^{T}}D=E$:

$ X=\left( \begin{matrix} 1&3&-4&-8&-2 \\ 2&-5&2&12&12 \\ -4&3&1&-2&-8 \\ -6&7&9&-3&4 \\ 9&7&-6&4&-3 \\ \end{matrix} \right). $

Using the proposed algorithm, after 29 iteration steps we obtain

$ {{X}_{29}}=\left( \begin{matrix} 1.0000&3.0000&-4.0000&-8.0000&-2.0000 \\ 2.0000&-5.0000&2.0000&12.0000&12.0000 \\ -4.0000&3.0000&1.0000&-2.0000&-8.0000 \\ -6.0000&7.0000&9.0000&-3.0000&4.0000 \\ 9.0000&7.0000&-6.0000&4.0000&-3.0000 \\ \end{matrix} \right)\in \Re _{r}^{5\times 5}(P),$

with the corresponding residual

$ {{R}_{29}}=\left\|E-A{{X}_{29}}B-C{{X}_{29}}^{T}D \right\|=\text{4.2299e-012}, $

and relative error

${{\delta }_{29}}=\left\| {{X}_{29}}-X \right\|/\left\| X \right\|=\text{7.8262e-015},$

where the subscript denotes the number of iteration steps.

Example 2  Consider again the minimum residual problem (3.1), with $A, B, C, D, P$ and $\bar{X}$ as in Example 1 and

$ E=\left( \begin{matrix} \text{-2060}&\text{-1543 }&\text{ 1510 }&\text{838}&\text{-195} \\ \text{261}&\text{-271}&\text{227}&\text{-742}&\text{304} \\ \text{-119}&\text{-524}&\text{720}&\text{-1683}&\text{4651} \\ \text{-563}&\text{1059}&\text{-773}&\text{796}&\text{-3000} \end{matrix} \right).$

One can verify that the equation $AXB+C{{X}^{T}}D=E$ is inconsistent. Using the algorithm, we obtain

$ {{X}_{21}}=\left( \begin{matrix} \text{1}\text{.0009}&\text{3}\text{.0041}&\text{-3}\text{.9952 }&\text{-8}\text{.0070}&\text{-2}\text{.0278} \\ \text{1}\text{.9442}&\text{-5}\text{.0596}&\text{1}\text{.9442 }&\text{12}\text{.0414}&\text{12}\text{.0414} \\ \text{ -3}\text{.9952}&\text{3}\text{.0041}&\text{1}\text{.0009}&\text{-2}\text{.0278}&\text{-8}\text{.0070} \\ \text{-5}\text{.9965}&\text{7}\text{.0020}&\text{9}\text{.0038}&\text{-2}\text{.9887}&\text{4}\text{.0117} \\ \text{9}\text{.0038}&\text{7}\text{.0020}&\text{-5}\text{.9965}&\text{4}\text{.0117}&\text{-2}\text{.9887} \\ \end{matrix} \right)\in \Re _{r}^{5\times 5}(P),$

with the corresponding residual

$ {{R}_{21}}=\left\| E-A{{X}_{21}}B-C{{X}_{21}}^{T}D \right\|=\text{2}\text{.0560}.$

Example 3  Consider again the minimum residual problem (3.1), with $A, B, C, D, E$ and $P$ as in Example 1 and

$ \bar{X}=\left( \begin{matrix} 10&10&10&10&10 \\ 10&10&10&10&10 \\ 10&10&10&10&10 \\ 10&10&10&10&10 \\ 10&10&10&10&10 \\ \end{matrix} \right)\in \Re _{r}^{5\times 5}(P). $

Using the algorithm, after 37 iteration steps we obtain

$ {{\tilde{X}}_{37}}=\left( \begin{matrix} \text{-9}\text{.0000}&\text{-7}\text{.0000}&\text{-14}\text{.0000}&\text{-18}\text{.0000}&\text{-12}\text{.0000} \\ \text{-8}\text{.0000}&\text{-15}\text{.0000}&\text{-8}\text{.0000}&\text{2}\text{.0000}&\text{2}\text{.0000} \\ \text{-14}\text{.0000}&\text{-7}\text{.0000}&\text{-9}\text{.0000}&\text{-12}\text{.0000}&\text{-18}\text{.0000} \\ \text{-16}\text{.0000 }&\text{-3}\text{.0000}&\text{-1}\text{.0000}&\text{-13}\text{.0000}&\text{-6}\text{.0000} \\ \text{ -1}\text{.0000}&\text{-3}\text{.0000}&\text{-16}\text{.0000}&\text{-6}\text{.0000}&\text{-13}\text{.0000} \\ \end{matrix} \right)\in \Re _{r}^{5\times 5}(P),$

so the reflexive optimal approximation solution is

$ {{\hat{X}}_{37}}={{\tilde{X}}_{37}}+\bar{X}=\left( \begin{matrix} 1.0000&3.0000&-4.0000&-8.0000&-2.0000 \\ 2.0000&-5.0000&2.0000&12.0000&12.0000 \\ -4.0000&3.0000&1.0000&-2.0000&-8.0000 \\ -6.0000&7.0000&9.0000&-3.0000&4.0000 \\ 9.0000&7.0000&-6.0000&4.0000&-3.0000 \\ \end{matrix} \right)\in \Re _{r}^{5\times 5}(P),$

with the corresponding residual

$ {{R}_{37}}=\left\|E-A{{{\hat{X}}}_{37}}B-C{{{\hat{X}}}_{37}}^{T}D \right\|=\text{3.4050e-012}.$
4 Conclusions

Based on the idea of the conjugate direction method, this paper proposes an iterative algorithm for the reflexive optimal approximation solution of the matrix equation $AXB+C{{X}^{T}}D=E$. Whether or not $AXB+C{{X}^{T}}D=E$ is consistent, starting from any initial reflexive matrix ${{X}_{1}}$, the algorithm obtains the reflexive optimal approximation solution within finitely many iteration steps. The results of three numerical examples show that the algorithm is feasible.

References
[1] Chen Hsin-Chu. Generalized reflexive matrices: special properties and applications[J]. SIAM J. Matrix Anal. Appl., 1998, 19: 140–153.
[2] Chen Hsin-Chu. The SAS domain decomposition method for structural analysis[R]. CSRD Tech. Report 754, Center for Supercomputing Research and Development, Urbana, IL: University of Illinois, 1988.
[3] Piao Fengxian, Zhang Qingling, Wang Zhefeng. The solution to matrix equation AX + X^T C = B[J]. J. Franklin Institute, 2007, 344: 1056–1062. DOI:10.1016/j.jfranklin.2007.05.002
[4] Xie Li, Liu Yanjun, Yang Huizhong. Gradient based and least squares based iterative algorithms for matrix equations AXB + CX^T D = E[J]. Appl. Math. Comput., 2010, 217: 2191–2199.
[5] Wang Minghui, Cheng Xuehan, Wei Musheng. Iterative algorithms for solving the matrix equation AXB + CX^T D = E[J]. Appl. Math. Comput., 2007, 187(2): 622–629.
[6] Teran F, Dopico F. The solution of the equation XA + AX^T = 0 and its applications to the theory of orbits[J]. Linear Algebra Appl., 2011, 434: 44–67. DOI:10.1016/j.laa.2010.08.005
[7] Zhao Linlin. The solvability of the matrix equation AXB + CX^T D = E[J]. Journal of Shandong University, 2012, 47(10): 45–48.
[8] Dehghan Mehdi, Hajarian Masoud. Finite iterative algorithms for the reflexive and anti-reflexive solutions of the matrix equations A1X1B1 = C, A2X2B2 = D[J]. Math. Comput. Modelling, 2009, 49: 1937–1959. DOI:10.1016/j.mcm.2008.12.014
[9] Sheng Xingping, Su Youfeng, Chen Guoliang. An iterative algorithm for the minimum-norm least squares solution of the matrix equation A^T XB + B^T X^T A = D[J]. Numerical Mathematics: A Journal of Chinese Universities, 2008, 30(4): 352–362.
[10] Sun Heming, Li Qingfang, Yang Jiawen. The optimal approximation solution of the matrix equation AXB + CXD = E over reflexive matrices[J]. Journal of Chongqing University of Technology, 2012, 26(4): 109–114.
[11] Zhang Guangcheng, Wang Wenjuan, Han Huilei, et al. Computational Methods for Nonlinear Optimization[M]. Beijing: Higher Education Press, 2006.
[12] Peng Zhuohua, Hu Xiyan, Zhang Lei. An efficient algorithm for the least-squares reflexive solution of the matrix equations A1XB1 = C1, A2XB2 = C2[J]. Appl. Math. Comput., 2006, 181: 988–999.