Sylvester equations play an important role in many fields of scientific computing and engineering, such as control theory, linear system theory, the numerical solution of differential equations, optimization, signal processing and statistics (see [1-4]). In this paper, we consider the Sylvester equation
$$ AX+XB=C, \qquad(1.1) $$
where $ A\in\mathbb{R}^{m\times m} $ and $ B\in\mathbb{R}^{n\times n} $ are M-matrices with at least one of them nonsingular, and $ C\in\mathbb{R}^{m\times n} $ is a nonnegative matrix. This kind of Sylvester equation is called an M-matrix Sylvester equation (see [5]). Under the above conditions, equation (1.1) has a unique nonnegative solution. The M-matrix Sylvester equation appears frequently in iterative methods for nonsymmetric algebraic Riccati equations, while the M-matrix Lyapunov equation, i.e., the case $ B=A^T $, arises in positive systems (see [6]).
The following are some definitions and lemmas, which are mainly from [7].
Let $ A=(a_{ij})\in\mathbb{R}^{m\times n} $; then $ A $ is called a nonnegative matrix if $ a_{ij}\geq 0 $ for all $ i, j $. For $ A=(a_{ij}), B=(b_{ij})\in\mathbb{R}^{m\times n} $, we write $ A\geq B $ ($ A>B $) if $ a_{ij}\geq b_{ij} $ ($ a_{ij}>b_{ij} $) for all $ i, j $. A square matrix $ A=(a_{ij})\in\mathbb{R}^{n\times n} $ is called a Z-matrix if $ a_{ij}\leq0 $ for all $ i\neq j $. Any Z-matrix $ A $ can be written as $ A=sI-B $, where $ s $ is a scalar and $ B $ is a nonnegative matrix. In addition, $ A $ is called an M-matrix if $ s\geq\rho(B) $, where $ \rho(B) $ is the spectral radius of $ B $. In particular, $ A $ is called a nonsingular M-matrix if $ s>\rho(B) $ and a singular M-matrix if $ s=\rho(B) $.
Lemma 1.1 Let $ A\in\mathbb{R}^{n\times n} $ be a Z-matrix. Then the following three statements are equivalent:
(1) $ A $ is a nonsingular M-matrix;
(2) $ A^{-1}\geq 0 $;
(3) $ Av > 0 $ for some vector $ v>0 $.
Lemma 1.2 Let $ A $ and $ B $ be Z-matrices. If $ A $ is a nonsingular M-matrix and $ A\leq B $, then $ B $ is also a nonsingular M-matrix. In particular, for any $ \alpha\geq 0 $, $ B=\alpha I+A $ is a nonsingular M-matrix.
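For illustration (an example of our own), consider the Z-matrix
$$ A=\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}=3I-\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, $$
where $ s=3>2=\rho(B) $, so $ A $ is a nonsingular M-matrix. In agreement with Lemma 1.1, $ A^{-1}=\frac{1}{3}\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\geq 0 $, and $ Av=(1, 1)^T>0 $ for the vector $ v=(1, 1)^T>0 $.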
The rest of the paper is organized as follows. In Section 2, we review some iterative methods for solving M-matrix Sylvester equations. In Section 3, we propose a class of Smith-like iteration methods and give their convergence analysis. In Section 4, we use some numerical examples to show the feasibility and effectiveness of the new method. Conclusions are given in Section 5.
Due to the wide applications of Sylvester equations, many efficient numerical methods have been proposed for solving them. The basic method is the Bartels-Stewart algorithm, which is very effective for small-scale problems (see [8]). However, for large-scale problems, iterative methods are more effective than direct methods. Common iterative methods for solving Sylvester equations include the Smith method [9], the alternating direction method [10], Krylov subspace methods [11], etc. For recent developments, see [12-17].
The Smith method is a classical, quadratically convergent iterative method for solving Sylvester equations. For the M-matrix Sylvester equation, it is constructed as follows (see [9, 18] for details). For a fixed scalar $ \mu>0 $, equation (1.1) can be written as
$$ (\mu I+A)X(\mu I+B)-(\mu I-A)X(\mu I-B)=2\mu C, $$
thus we have $ X=X_0+E_0XF_0, $ where
$$ X_0=2\mu(\mu I+A)^{-1}C(\mu I+B)^{-1},\quad E_0=(\mu I+A)^{-1}(\mu I-A),\quad F_0=(\mu I-B)(\mu I+B)^{-1}. $$
The solution of (1.1) admits the following series expansion $ X=\sum\limits_{i=0}^{\infty}E_0^iX_0F_0^i $ which can be quickly approximated by $ X_{k+1}=X_k+E_0^{2^k}X_kF_0^{2^k}, k\geq 0$.
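To see why this approximation converges so quickly, note (a standard observation) that, by induction on $ k $, the iterate $ X_k $ collects the first $ 2^k $ terms of the series:
$$ X_k=\sum\limits_{i=0}^{2^k-1}E_0^iX_0F_0^i, $$
so each step doubles the number of series terms captured.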
Algorithm 1: Smith method
(1) Choose $ \mu=\max\{a_{ii}, b_{ii}\} $;
(2) Compute
$$ X_0=2\mu(\mu I+A)^{-1}C(\mu I+B)^{-1},\quad E_0=(\mu I+A)^{-1}(\mu I-A),\quad F_0=(\mu I-B)(\mu I+B)^{-1}; $$
(3) For $ k=0, 1, \cdots $ until convergence, compute
$$ X_{k+1}=X_k+E_kX_kF_k,\quad E_{k+1}=E_k^2,\quad F_{k+1}=F_k^2. $$
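For concreteness, the following MATLAB sketch implements Algorithm 1 under the assumptions above; the function name smith, the stopping parameters tol and maxit, and the relative Frobenius residual test are our own choices, not part of the original algorithm statement.
```matlab
% A minimal MATLAB sketch of Algorithm 1 (Smith method).
function X = smith(A, B, C, tol, maxit)
mu = max([diag(A); diag(B)]);              % step (1): mu = max{a_ii, b_ii}
m = size(A, 1);  n = size(B, 1);
Ap = mu*eye(m) + A;  Bp = mu*eye(n) + B;
E = Ap \ (mu*eye(m) - A);                  % E_0 = (mu*I+A)^{-1}(mu*I-A)
F = (mu*eye(n) - B) / Bp;                  % F_0 = (mu*I-B)(mu*I+B)^{-1}
X = 2*mu * (Ap \ C) / Bp;                  % X_0 = 2*mu*(mu*I+A)^{-1}*C*(mu*I+B)^{-1}
nC = norm(C, 'fro');
for k = 1:maxit                            % step (3): doubling iteration
    X = X + E * X * F;                     % X_{k+1} = X_k + E_k*X_k*F_k
    E = E * E;  F = F * F;                 % E_{k+1} = E_k^2, F_{k+1} = F_k^2
    if norm(A*X + X*B - C, 'fro') / nC < tol, break; end
end
end
```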
Recently, Wang et al. proposed an alternating directional Smith method in [18], which differs from the Smith method only in the initial setting. It is constructed as follows (see [18] for details). Choose $ \alpha, \beta>0 $ and write equation (1.1) as $ (\beta I+A)X(\alpha I+B)-(\alpha I-A)X(\beta I-B)=(\alpha+\beta)C, $ thus we have $ X=X_0+E_0XF_0, $ where
$$ X_0=(\alpha+\beta)(\beta I+A)^{-1}C(\alpha I+B)^{-1},\quad E_0=(\beta I+A)^{-1}(\alpha I-A),\quad F_0=(\beta I-B)(\alpha I+B)^{-1}. $$
The solution of (1.1) admits the following series expansion
$$ X=\sum\limits_{i=0}^{\infty}E_0^iX_0F_0^i, $$
which can be quickly approximated by
$$ X_{k+1}=X_k+E_0^{2^k}X_kF_0^{2^k},\quad k\geq 0. $$
The rest of the method is the same as in the Smith method.
Algorithm 2: Alternating directional Smith method
(1) Choose $ \alpha=\max\{a_{ii}\} $, $ \beta=\max\{b_{ii}\} $; the remaining steps are the same as steps (2)-(3) of Algorithm 1, with $ X_0, E_0, F_0 $ as defined above.
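The corresponding initialization might read as follows in MATLAB (a sketch under the same assumptions as the sketch of Algorithm 1; the doubling loop is then identical to that of Algorithm 1):
```matlab
% Initial setting of Algorithm 2 (alternating directional Smith method).
alpha = max(diag(A));  beta = max(diag(B));
m = size(A, 1);  n = size(B, 1);
E = (beta*eye(m) + A) \ (alpha*eye(m) - A);     % E_0 = (beta*I+A)^{-1}(alpha*I-A)
F = (beta*eye(n) - B) / (alpha*eye(n) + B);     % F_0 = (beta*I-B)(alpha*I+B)^{-1}
X = (alpha + beta) * ((beta*eye(m) + A) \ C) / (alpha*eye(n) + B);  % X_0
```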
Compared with the Smith method, the alternating directional Smith method has a smaller convergence factor, so it usually converges faster. However, it still has room for improvement. In this paper, we propose a class of Smith-like methods for solving M-matrix Sylvester equations. Compared with the Smith method and the alternating directional Smith method, the new method requires fewer initial computations and thus has better numerical behaviour under certain conditions.
In this section, based on the properties of M-matrices and the idea of the Smith method, we propose a class of Smith-like iteration methods for solving M-matrix Sylvester equations and give a convergence analysis of the new method.
Choose a parameter $ \alpha>0 $ and write equation (1.1) as $ X(\alpha I+B)=(\alpha I-A)X+C, $ then we have $ X=X_0+E_0XF_0, $ where
$$ X_0=C(\alpha I+B)^{-1},\quad E_0=\alpha I-A,\quad F_0=(\alpha I+B)^{-1}. \qquad(3.1) $$
The solution can be written as
$$ X=\sum\limits_{i=0}^{\infty}E_0^iX_0F_0^i. $$
As in the Smith method, we can approximate the solution by the following iteration:
$$ X_{k+1}=X_k+E_0^{2^k}X_kF_0^{2^k},\quad k\geq 0. \qquad(3.2) $$
Compared with the Smith method and the alternating directional Smith method, the above method only needs to compute one matrix inverse in the initial setting (3.1), so its initial computational cost is smaller. Below, we give the convergence analysis of the method (3.1)-(3.2).
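A minimal MATLAB sketch of the iteration (3.1)-(3.2) is given below; the function name smith_like and the stopping parameters tol, maxit are our own choices. Note that $ (\alpha I+B)^{-1} $ is the only inverse that has to be formed.
```matlab
% A sketch of the Smith-like iteration (3.1)-(3.2) for A*X + X*B = C,
% assuming alpha >= max(diag(A)).
function X = smith_like(A, B, C, alpha, tol, maxit)
m = size(A, 1);  n = size(B, 1);
F = inv(alpha*eye(n) + B);       % F_0 = (alpha*I+B)^{-1}, the only inverse needed
E = alpha*eye(m) - A;            % E_0 = alpha*I - A
X = C * F;                       % X_0 = C*(alpha*I+B)^{-1}
nC = norm(C, 'fro');
for k = 1:maxit
    X = X + E * X * F;           % doubling step of (3.2)
    E = E * E;  F = F * F;       % repeated squaring realizes the powers 2^k
    if norm(A*X + X*B - C, 'fro') / nC < tol, break; end
end
end
```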
Lemma 3.1 For equation (1.1), if the parameter $ \alpha $ satisfies $ \alpha\geq \max\{a_{ii}\} $, then $ X_0, E_0, F_0 $ defined in (3.1) are all nonnegative.
Proof Since $ A $ is an M-matrix and $ \alpha\geq \max\{a_{ii}\} $, we know that $ E_0=\alpha I-A\geq 0 $. On the other hand, since $ B $ is an M-matrix and $ \alpha>0 $, we conclude from Lemma 1.2 that $ \alpha I+B $ is a nonsingular M-matrix. Thus from Lemma 1.1, we have $ F_0=(\alpha I+B)^{-1}\geq 0 $. Similarly, we have $ X_0=C(\alpha I+B)^{-1}\geq 0 $.
Theorem 3.1 For equation (1.1), if the parameter $ \alpha $ satisfies $ \alpha\geq \max\{a_{ii}\} $, then the sequence $ \{X_k\} $ generated by (3.2) is nonnegative and monotonically increasing, and converges to $ X $. In addition, the convergence rate is quadratic, and the convergence factor is given by
$$ \rho(E_0)\rho(F_0)=\frac{\alpha-\lambda_{min}(A)}{\alpha+\lambda_{min}(B)}, $$
where $ \lambda_{min}(A) $, $ \lambda_{min}(B) $ are the minimal nonnegative eigenvalues of $ A $ and $ B $, respectively.
Proof From Lemma 3.1 and the iteration (3.2), it is easy to verify that the sequence $ \{X_k\} $ is nonnegative and monotonically increasing. In addition, we can conclude that
$$ X_k=\sum\limits_{i=0}^{2^k-1}E_0^iX_0F_0^i. $$
Thus we have
$$ X-X_k=\sum\limits_{i=2^k}^{\infty}E_0^iX_0F_0^i=E_0^{2^k}\Big(\sum\limits_{i=0}^{\infty}E_0^iX_0F_0^i\Big)F_0^{2^k}=E_0^{2^k}XF_0^{2^k}. $$
Taking norms on both sides and noting that $ \lim\limits_{k\rightarrow \infty}\sqrt[k]{\|A^k\|}=\rho(A) $, we have
$$ \limsup\limits_{k\rightarrow\infty}\sqrt[2^k]{\|X-X_k\|}\leq\rho(E_0)\rho(F_0). $$
Since $ E_0, F_0 $ are nonnegative, it is easy to conclude that
$$ \rho(E_0)=\alpha-\lambda_{min}(A),\quad \rho(F_0)=\frac{1}{\alpha+\lambda_{min}(B)},\quad \text{thus}\quad \rho(E_0)\rho(F_0)=\frac{\alpha-\lambda_{min}(A)}{\alpha+\lambda_{min}(B)}. $$
In addition, since $ A $ and $ B $ are M-matrices with at least one of them nonsingular, we have $ \frac{\alpha-\lambda_{min}(A)}{\alpha+\lambda_{min}(B)}<1 $. Thus the convergence rate of (3.2) is quadratic.
Corollary 3.1 Let the assumptions be as in Theorem 3.1. Then the optimal parameter in (3.1)-(3.2) is $ \alpha=\max\{a_{ii}\} $.
Proof The convergence factor $ \frac{\alpha-\lambda_{min}(A)}{\alpha+\lambda_{min}(B)} $ is increasing in $ \alpha $, since its derivative with respect to $ \alpha $ is $ \frac{\lambda_{min}(A)+\lambda_{min}(B)}{(\alpha+\lambda_{min}(B))^2}>0 $; hence the smaller the parameter, the smaller the convergence factor. Since Theorem 3.1 requires $ \alpha\geq\max\{a_{ii}\} $, the optimal parameter is $ \alpha=\max\{a_{ii}\} $.
Similarly, if we choose a parameter $ \beta>0 $, equation (1.1) can be written as
$$ (\beta I+A)X=X(\beta I-B)+C, $$
and we have
$$ X=X_0+E_0XF_0, $$
where
$$ X_0=(\beta I+A)^{-1}C,\quad E_0=(\beta I+A)^{-1},\quad F_0=\beta I-B. \qquad(3.3) $$
Similarly, we arrive at the following iteration:
$$ X_{k+1}=X_k+E_0^{2^k}X_kF_0^{2^k},\quad k\geq 0. \qquad(3.4) $$
We have similar convergence results for the above iterative method (3.3)-(3.4).
Theorem 3.2 For equation (1.1), if the parameter $ \beta $ satisfies $ \beta\geq \max\{b_{ii}\} $, then the sequence $ \{X_k\} $ generated by (3.4) is nonnegative and monotonically increasing, and converges to $ X $. In addition, the convergence rate is quadratic, and the convergence factor is
$$ \rho(E_0)\rho(F_0)=\frac{\beta-\lambda_{min}(B)}{\beta+\lambda_{min}(A)}. $$
Corollary 3.2 Let the assumptions be as in Theorem 3.2. Then the optimal parameter in (3.3)-(3.4) is $ \beta=\max\{b_{ii}\} $.
From the convergence analysis of the iterations (3.2) and (3.4), it can be seen that the convergence factors of the two iterations are in general different. In order to achieve faster convergence in practical computation, we take the numerical behaviour of both iterations into account and propose the following algorithm.
Algorithm 3: Smith-like iteration method
(1) Compute $ \alpha=\max\{a_{ii}\} $, $ \beta=\max\{b_{ii}\} $, where $ a_{ii}, b_{ii} $ are the diagonal entries of $ A, B $ respectively;
(2) If $ \alpha\leq \beta $, compute according to (3.1)-(3.2);
(3) If $ \alpha> \beta $, compute according to (3.3)-(3.4).
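Reusing the sketch of (3.1)-(3.2) from Section 3, Algorithm 3 can be realized as follows; here we use the (easily verified) observation that the iteration (3.3)-(3.4) for $ AX+XB=C $ coincides with the iteration (3.1)-(3.2) applied to the transposed equation $ B^TY+YA^T=C^T $, followed by $ X=Y^T $. The wrapper name below is our own.
```matlab
% A sketch of Algorithm 3 (Smith-like iteration method), reusing smith_like.
function X = smith_like_alg3(A, B, C, tol, maxit)
alpha = max(diag(A));  beta = max(diag(B));        % step (1)
if alpha <= beta
    X = smith_like(A, B, C, alpha, tol, maxit);    % step (2): iteration (3.1)-(3.2)
else
    X = smith_like(B', A', C', beta, tol, maxit)'; % step (3): iteration (3.3)-(3.4)
end
end
```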
In this section, we use several examples to show the feasibility and effectiveness of the Smith-like method. We compare the numerical behaviour of the Smith method, the alternating directional Smith method and the Smith-like iteration method, denoted by Alg 1, Alg 2 and Alg 3, respectively. The numerical results are presented in terms of the number of iterations (IT), the CPU time in seconds (CPU) and the residual (RES), where
$$ RES=\frac{\|AX_k+X_kB-C\|_F}{\|C\|_F}. $$
In our implementations all iterations are performed in MATLAB (version R2012a) on a personal computer and are terminated when the current iterate satisfies $ RES<10^{-12} $.
Example 4.1 Consider equation (1.1), where
Numerical results for this example are reported in Table 1. From Table 1, we can conclude that all three methods compute the solution to the desired accuracy and exhibit good numerical behaviour. In addition, Algorithm 2 requires slightly fewer iterations, while Algorithm 3 requires slightly less CPU time.
Example 4.2 Consider equation (1.1), where
Numerical results are reported in Table 2. For this problem, all three methods compute the solution to the desired accuracy, and the latter two methods behave better numerically. In particular, the CPU time of Algorithm 3 is slightly less than that of Algorithm 2.
Example 4.3 Consider equation (1.1) with
Here we take $ n=100 $. For different values of the parameter $ \omega $, the numerical results are shown in Table 3. It can be seen from the table that all three methods perform well. In particular, when $ \alpha=\max\{a_{ii}\} $ and $ \beta=\max\{b_{ii}\} $ differ greatly, Algorithm 3 requires the least CPU time.
Example 4.4 Consider equation (1.1) with
and $ C=I $. For different values of $ n $, the numerical results are shown in Table 4. It can be seen that all three methods perform well. In particular, Algorithm 3 requires the least CPU time.
From the above four examples, it can be seen that Algorithm 3 is feasible. In particular, when $ \alpha=\max\{a_{ii}\} $ and $ \beta=\max\{b_{ii}\} $ differ greatly, Algorithm 3 requires the least CPU time, so it is also effective.
In this paper, the numerical solution of M-matrix Sylvester equations is studied, and a class of Smith-like iteration methods is proposed to solve them. The convergence analysis and numerical examples show that the new method is feasible, and that it is effective under certain conditions.