Non-negative matrix factorization (NMF) [1] differs from other factorization methods in that non-negativity constraints are imposed, which makes the factorization results more interpretable in the real world.
NMF seeks to decompose a given non-negative matrix $A$ into the product of a non-negative basis matrix $W$ and a non-negative coefficient matrix $H$:
$$A \approx WH, \quad (1.1)$$
where $A \in R^{m\times n}$, $W \in R^{m\times r}$ and $H \in R^{r\times n}$, with $r$ typically much smaller than $\min(m, n)$. Hence NMF generates a reduced representation of the original data, and in this sense NMF is useful for extracting parts-based representations.
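As a concrete illustration of these dimensions, the following sketch (in Python with NumPy, purely for illustration; the paper's own experiments use MATLAB) builds a non-negative matrix with an exact rank-$r$ factorization and shows the storage reduction:

```python
import numpy as np

# Shapes in A ~ W H: A is m x n, W is m x r, H is r x n, all non-negative.
m, n, r = 60, 50, 5                # r << min(m, n) gives a reduced representation
rng = np.random.default_rng(0)
W = rng.random((m, r))             # non-negative basis matrix
H = rng.random((r, n))             # non-negative coefficient matrix
A = W @ H                          # non-negative by construction

print(A.shape)                     # (60, 50)
print(m * r + r * n, "<", m * n)   # 550 < 3000: the factors are far smaller
```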
In recent years, NMF, as a parts-based representation of data, has received considerable attention in machine learning, signal processing, computer vision, and other fields [2-5].
To minimize the approximation error between $A$ and $WH$, NMF is expressed as the constrained optimization problem
$$\min_{W, H}~F(W, H)=\frac{1}{2}\|A-WH\|^{2}, \quad {\rm s.t.}~~W\geq0, ~H\geq0, \quad (1.2)$$
where $\|\cdot\|$ is the Frobenius norm, and $W\geq0$ ($H\geq0$) means that all elements of the matrix $W$ ($H$) are non-negative. The Karush-Kuhn-Tucker (KKT) [6] conditions for problem (1.2) are expressed as follows:
$$W\geq0, \quad (WH-A)H^{T}\geq0, \quad W\circ\left((WH-A)H^{T}\right)=0,$$
$$H\geq0, \quad W^{T}(WH-A)\geq0, \quad H\circ\left(W^{T}(WH-A)\right)=0, \quad (1.3)$$
where $\circ$ denotes the elementwise product.
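Conditions (1.3) can be checked numerically through the equivalent projected-gradient form $\min(W, \nabla_{W}F)=0$, $\min(H, \nabla_{H}F)=0$ (elementwise minimum). A minimal sketch, assuming $F(W,H)=\frac{1}{2}\|A-WH\|_{F}^{2}$:

```python
import numpy as np

def kkt_residual(A, W, H):
    """Size of the violation of the KKT conditions (1.3).

    For x >= 0 and gradient g, the conditions x >= 0, g >= 0, x*g = 0
    are equivalent to min(x, g) = 0 elementwise, so a stationary point
    of problem (1.2) returns 0 here (up to rounding error).
    """
    R = W @ H - A
    grad_W = R @ H.T                       # gradient of F with respect to W
    grad_H = W.T @ R                       # gradient of F with respect to H
    return np.sqrt(np.linalg.norm(np.minimum(W, grad_W))**2 +
                   np.linalg.norm(np.minimum(H, grad_H))**2)
```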
Alternating non-negative least squares (ANLS) [2, 7, 8] works well in practice: its convergence speed is fast, and it does not suffer from the zero-locking of elements. It iterates the following subproblems until a stopping criterion is satisfied:
$$H^{k+1}=\arg\min_{H\geq0}F(W^{k}, H), \quad (1.4)$$
$$W^{k+1}=\arg\min_{W\geq0}F(W, H^{k+1}). \quad (1.5)$$
Clearly, $F(W^{k}, H)$ and $F(W, H^{k+1})$ are convex in $H$ and $W$, respectively.
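Since each subproblem separates across the columns of $H$ (respectively, the rows of $W$), it can be solved exactly by non-negative least squares. A sketch of one outer iteration, using SciPy's `nnls` solver (an illustrative choice, not the paper's implementation):

```python
import numpy as np
from scipy.optimize import nnls

def anls_step(A, W, H):
    """One ANLS iteration: solve subproblem (1.4), then subproblem (1.5).

    min_{H>=0} ||A - W H||_F splits into one NNLS problem per column of A;
    min_{W>=0} ||A - W H||_F splits into one NNLS problem per row of A.
    """
    H = np.column_stack([nnls(W, A[:, j])[0] for j in range(A.shape[1])])
    W = np.vstack([nnls(H.T, A[i, :])[0] for i in range(A.shape[0])])
    return W, H
```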
ANLS is widely used as an efficient computational method for NMF, but it has a serious drawback: its global convergence is not guaranteed in theory. Here, global convergence means that, for any initial solution, the sequence generated by ANLS has at least one limit point and every limit point is a stationary point of NMF. The main difficulty in proving global convergence lies in establishing the existence of a limit point.
In [9], the authors proposed a modified strategy for ANLS that is valid for proving global convergence. Motivated by the work of [9], we present another modified strategy that guarantees the existence of a limit point and ensures that every limit point is a stationary point of NMF. In addition, we give generalized modified strategies that can be applied in practice.
The rest of this paper is organized as follows. Our modified strategy is given in Section 2. The convergence analysis is established in Section 3. In Section 4, we give two generalized modified strategies. Numerical experiments are reported in Section 5. Finally, we conclude the paper in Section 6.
We briefly summarize our notation. For $A, B\in R^{m\times n}$, let $\langle A, B\rangle={\rm Tr}(AB^{T})$. $A_{\cdot i}$ denotes the $i$-th column of $A$, and $A_{i\cdot}$ denotes the $i$-th row of $A$. Let $\|x\|$ denote any norm of a vector $x$, and $\|x\|_{2}$ its 2-norm.
In this section, based on [9], we present our modified strategy. Let $(W^{k}, H^{k})$ be generated by alternating non-negative least squares (ANLS).
First, we state the modified strategy of [9], which is given as follows:
where $\Lambda_{1}={\rm diag}(\lambda_{j}^{1})$, $\Lambda_{2}={\rm diag}(\lambda_{j}^{2})$, $j=1, 2, \cdots, r, $
where $\alpha$ is a positive constant. This strategy ensures that ANLS is globally convergent.
Motivated by the work of [9], we give another modified strategy. In NMF, we use the fact that
$$WH=\sum_{j=1}^{r}W_{\cdot j}H_{j\cdot}=\sum_{j=1}^{r}(\lambda_{j}W_{\cdot j})\Big(\frac{1}{\lambda_{j}}H_{j\cdot}\Big) \quad \mbox{for any}~\lambda_{j}>0.$$
For $\|W_{\cdot j}\|\cdot\|H_{j \cdot}\|\neq0$, let $\bar{W}_{\cdot j}=\lambda_{j}W_{\cdot j}$, $\bar{H}_{j\cdot}=\frac{1}{\lambda_{j}}H_{j \cdot}$, and require that $\|\bar{W}_{\cdot j}\|=\|\bar{H}_{j\cdot}\|+\alpha$ ($\alpha>0$). This requirement reads $\lambda_{j}\|W_{\cdot j}\|=\frac{1}{\lambda_{j}}\|H_{j\cdot}\|+\alpha$, a quadratic equation in $\lambda_{j}$, whose positive root is
$$\lambda_{j}=\frac{\alpha+\sqrt{\alpha^{2}+4\|W_{\cdot j}\|\cdot\|H_{j\cdot}\|}}{2\|W_{\cdot j}\|}.$$
Hence, our modified strategy (Strategy 1) is described as follows: $\bar{W}^{k}=W^{k}\Lambda_{1}, \bar{H}^{k}=\Lambda_{2}H^{k}, $ where $\Lambda_{1}={\rm diag}(\lambda_{j}^{1})$, $\Lambda_{2}={\rm diag}(\lambda_{j}^{2})$, $j=1, 2, \cdots, r, $
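A minimal sketch of Strategy 1, using the 2-norm as the vector norm and the positive root of the quadratic equation derived above; the handling of the case $\|W_{\cdot j}\|\cdot\|H_{j\cdot}\|=0$ (pairs left unchanged here) is an assumption:

```python
import numpy as np

def strategy1(W, H, alpha):
    """Rescale so that ||W_.j|| = ||H_j.|| + alpha while keeping W @ H fixed.

    lambda_j is the positive root of lam*w = h/lam + alpha with
    w = ||W_.j||, h = ||H_j.||; multiplying column j of W by lambda_j and
    dividing row j of H by lambda_j leaves the product W @ H unchanged.
    """
    W, H = W.copy(), H.copy()
    for j in range(W.shape[1]):
        w, h = np.linalg.norm(W[:, j]), np.linalg.norm(H[j, :])
        if w * h > 0:
            lam = (alpha + np.sqrt(alpha**2 + 4.0 * w * h)) / (2.0 * w)
            W[:, j] *= lam                 # new column norm: lam * w
            H[j, :] /= lam                 # new row norm: h / lam = lam*w - alpha
        # w * h == 0: left unchanged (an assumption; the paper's treatment
        # of this case is not reproduced here)
    return W, H
```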
Based on the above analysis, we now state ANLS with Strategy 1 for NMF as follows.
Algorithm 1 (s.1) Given a starting point $(\bar{W}^{0}, \bar{H}^{0})\geq0$ and $\alpha>0$, set $k=0$.
(s.2) Stop if $(\bar{W}^{k}, \bar{H}^{k})$ satisfies (1.3); otherwise, compute $(W^{k+1}, H^{k+1})$ by (1.4) and (1.5).
(s.3) Modify $(W^{k+1}, H^{k+1})$ to $(\bar{W}^{k+1}, \bar{H}^{k+1})$ using Strategy 1.
(s.4) Set $k=k+1$ and go to s.2.
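Putting the pieces together, the following is a sketch of Algorithm 1 built from the helper functions in the earlier sketches; the tolerance, the iteration cap, and the random initialization are illustrative choices, not the paper's settings:

```python
import numpy as np

def algorithm1(A, r, alpha, tol=1e-6, max_iter=500, seed=0):
    """ANLS with Strategy 1 (steps s.1-s.4)."""
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], r))             # s.1: non-negative start
    H = rng.random((r, A.shape[1]))
    W, H = strategy1(W, H, alpha)
    for _ in range(max_iter):                   # s.4: increment and repeat
        if kkt_residual(A, W, H) <= tol:        # s.2: stop if (1.3) holds
            break
        W, H = anls_step(A, W, H)               # s.2: solve (1.4) and (1.5)
        W, H = strategy1(W, H, alpha)           # s.3: apply Strategy 1
    return W, H
```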
Similar to [9], the objective function of (1.4) ((1.5)) has a global minimizer on $R^{r\times n}_{+}$ ($R^{m\times r}_{+}$). In this section, we study the global convergence of Algorithm 1.
Lemma 3.1 Let $\{(\bar{W}^{k}, \bar{H}^{k})\}$ be generated by Algorithm 1. Then $\{(\bar{W}^{k}, \bar{H}^{k})\}$ is bounded.
Proof Since $H^{k}$ is a stationary point of $F(W^{k}, H)$, we have $\langle(W^{k})^{T}(W^{k}H^{k}-A), H^{k}\rangle=0, $
that is, $\langle W^{k}H^{k}-A, W^{k}H^{k}\rangle=0$, so $\|W^{k}H^{k}\|^{2}_{F}=\langle A, W^{k}H^{k}\rangle\leq\|A\|_{F}\|W^{k}H^{k}\|_{F}$ by the Cauchy-Schwarz inequality. Therefore, $\|\bar{W}^{k}\bar{H}^{k}\|_{F}=\|W^{k}H^{k}\|_{F}\leq\|A\|_{F}.$
By construction of Strategy 1, $\|\bar{W}_{\cdot j}^{k}\|=\|\bar{H}_{j\cdot}^{k}\|+\alpha$. Meanwhile, there exist constants $c_{1}, c_{2}, c_{3}, c_{4}>0$ such that
We can get
and
Furthermore,
where
Set $t=\sum\limits^{r}_{j=1}\|\bar{H}_{j\cdot}^{k}\|_{2}^{2}$; then $c^{2}_{3}\cdot\frac{t^{2}}{r}+\alpha^{2}\cdot t\leq c_{1}^{2}\|A\|^{2}_{F}.$ Since the left-hand side is a quadratic in $t$ with positive leading coefficient, there exists a constant $M>0$ such that $t\leq M$; indeed, one may take $M=\frac{r}{2c_{3}^{2}}\Big(\sqrt{\alpha^{4}+\frac{4c_{3}^{2}c_{1}^{2}\|A\|_{F}^{2}}{r}}-\alpha^{2}\Big)$. Hence, $\|\bar{H}^{k}\|_{F}$ is bounded.
Similarly, we can show that $\|\bar{W}^{k}\|_{F}$ is bounded. Therefore, $\{(\bar{W}^{k}, \bar{H}^{k})\}$ is bounded.
Lemma 3.2 Let $\{(\bar{W}^{k}, \bar{H}^{k})\}$ be generated by Algorithm 1, and let $(W^{*}, H^{*})$ be a limit point of $\{(\bar{W}^{k}, \bar{H}^{k})\}$. Then $(W^{*}, H^{*})$ is a stationary point of problem (1.2).
Since the above lemma corresponds to [9, Theorem 2] and its proof is the same as that of [9, Theorem 2], we omit it here.
From the above results, $\alpha$ is a positive constant in both Strategy 1 and the strategy of [9]. We may replace $\alpha$ with $g(\alpha)$, where $g(\alpha)$ is a function of $\alpha$ satisfying $g(\alpha)>0$. In this way, we obtain two generalized modified strategies.
The convergence of ANLS is still guaranteed whenever $g(\alpha)>0$. In practice, we can take, for example, $g(\alpha)=\frac{1}{\alpha^{2}}$ or $g(\alpha)=e^{-\alpha}$; the appropriate choice of $g(\alpha)$ depends on the application problem.
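In code, the generalization amounts to composing the rescaling with $g$; a minimal sketch reusing the `strategy1` function from the earlier sketch:

```python
import numpy as np

def strategy1_generalized(W, H, alpha, g=lambda a: np.exp(-a)):
    """Generalized Strategy 1: the constant alpha is replaced by g(alpha) > 0,
    so the rescaling targets ||W_.j|| = ||H_j.|| + g(alpha)."""
    return strategy1(W, H, g(alpha))

# Example choices of g: g(a) = 1/a**2 or g(a) = np.exp(-a), both positive.
```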
In this section, we present numerical experiments for the proposed modified strategies. We apply these strategies to the following algorithms: SBBNMF [9], BBNMF [10], and PGNMF [11]. All procedures are written in MATLAB and run in MATLAB 7.10 on a PC (2.13 GHz CPU, 2 GB memory) under Windows 7.
In the experiments, we compare the following statistics: the number of inner iterations (Iter), the number of outer iterations (Oter), the objective function value (Fun), the residual function value (Res), and the CPU time in seconds (Time). The formula for Res is similar to that of [9], and the initial parameters are set as follows
To prevent the initial point from biasing the numerical results, each experiment uses 30 randomly generated initial points. We list four tables of experimental results for different values of $\alpha$.
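A hypothetical sketch of this protocol, where Res is taken to be the relative residual $\|A-WH\|_{F}/\|A\|_{F}$ (an assumption; the exact formula of [9] is not reproduced here):

```python
import numpy as np

def average_res(A, r, alpha, n_trials=30):
    """Average the residual statistic over 30 random initial points."""
    res = []
    for seed in range(n_trials):                  # one random start per trial
        W, H = algorithm1(A, r, alpha, seed=seed)
        # Assumed Res: relative Frobenius residual of the factorization
        res.append(np.linalg.norm(A - W @ H) / np.linalg.norm(A))
    return np.mean(res)
```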
The tables indicate that the proposed modified strategies are effective for these algorithms (SBBNMF, BBNMF, and PGNMF) within the ANLS framework. In some experiments, Iter (Oter) is not an integer because we report the average over the 30 initial points; that is, Iter (Oter) represents the average number of iterations.
Non-negative matrix factorization (NMF) is not only a well-known matrix decomposition approach but also a useful and efficient feature extraction technique. In this paper, we proposed a modified strategy that ensures the global convergence of ANLS. In addition, we gave generalized modified strategies.