Real-time solution of large-scale simultaneous linear equations in the minimal-norm sense arises frequently in areas such as optimization, system identification, intelligent control, optimal control and robotics [1-4]. In many such problems the number of unknowns involved is very large, and the solution is required in real time.
Neural networks have been proposed for solving real-time problems involving simultaneous equations, including the linear case [5-10]. We develop an electronic neural network, the gradient Hopfield neural network, for solving simultaneous linear equations. Meanwhile, we consider the case in which there are implementation errors such as circuit noise, finite op-amp gains and frequency-dependent effects. Assuming these errors are bounded but otherwise uncertain, they can be regarded as system parameter perturbations caused by disturbance or integration effects arising inside and outside the overall system. Hence, for a Hopfield neural network whose gradient disturbance is bounded and uncertain, the first issue to consider is its robustness; that is, whether, for any disturbance satisfying the bound, the error between the actual solution and the exact solution of the model remains uniformly bounded or converges [4].
This paper studies the robustness of the disturbed gradient Hopfield neural network theoretically and provides numerical simulations of several examples to verify the theory. The remainder of the paper is organized as follows. Section 2 presents the disturbed gradient neural network model. Section 3 gives the robustness analysis. Section 4 reports Matlab simulations of the robustness. Section 5 concludes the paper.
Based on the least-mean-square (LMS) error-minimization problem for $Ax=b$, the negative-gradient neural network model (theoretical model) [4] with a non-linear activation function can be described as
$$\dot{x}(t)=-\alpha A^{T}F(Ax(t)-b), \qquad (2.1)$$
where $A=(a_{ij})_{m\times n}\in R^{m\times n}$, $x=(x_{1}, x_{2}, \cdots, x_{n})^{T}\in R^{n}$, $b=(b_{1}, b_{2}, \cdots, b_{m})^{T}\in R^{m}$, and $F(y)=(f(y_{1}), f(y_{2}), \cdots, f(y_{m}))^{T}$ denotes the componentwise application of the activation function $f(\cdot)$. The matrix $A$ has full row rank, namely rank$(A)=m$. The parameter $\alpha>0$ determines the convergence rate of the network.
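In other words, (2.1) can be viewed as a steepest-descent flow of the least-squares energy with the activation $F(\cdot)$ applied to the residual; with the linear activation $F(y)=y$ it reduces exactly to the negative-gradient flow:
$$E(x)=\frac{1}{2}\|Ax-b\|_{2}^{2},\qquad \nabla E(x)=A^{T}(Ax-b),\qquad \dot{x}(t)=-\alpha\nabla E(x(t))=-\alpha A^{T}(Ax(t)-b).$$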
The non-linear activation function $f(\cdot)$ is usually taken to be one of the following types (commonly used explicit forms are recalled after the list).
(1) The linear activation function $f(u)=u.$
(2) The bipolar sigmoid activation function.
(3) The power activation function, where the power is an odd integer.
(4) The power-sigmoid activation function.
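The explicit formulas for types (2)-(4) commonly adopted in the gradient neural network literature, assumed here with design parameter $\xi\ge 2$ and odd integer $p\ge 3$ (consistent with the values $\xi=4$ and $p=3$ used in Section 4), are
$$f_{\rm bs}(u)=\frac{1-e^{-\xi u}}{1+e^{-\xi u}},\qquad f_{\rm pow}(u)=u^{p},\qquad f_{\rm ps}(u)=\begin{cases} u^{p}, & |u|\ge 1,\\ \frac{1+e^{-\xi}}{1-e^{-\xi}}\cdot\frac{1-e^{-\xi u}}{1+e^{-\xi u}}, & |u|<1, \end{cases}$$
where the scaling factor in the power-sigmoid case makes $f_{\rm ps}$ continuous at $|u|=1$.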
Remark 1 For the power-sigmoid and bipolar sigmoid activation functions, at any fixed point $y=(y_{1}, y_{2}, \cdots, y_{m})^{T}\in R^{m}$, the Jacobian matrix $\frac{\partial F(y)}{\partial y}={\rm diag} (f'(y_{1}), f'(y_{2}), \cdots, f'(y_{m}))$ is a positive definite diagonal matrix.
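For instance, assuming the commonly used forms recalled above, a direct computation gives
$$f'(u)=\frac{2\xi e^{-\xi u}}{(1+e^{-\xi u})^{2}}>0\quad\text{for all }u\in R$$
for the bipolar sigmoid, while for the power-sigmoid function $f'(u)=pu^{p-1}\ge p>0$ on $|u|\ge 1$ and $f'$ is a positive multiple of the bipolar sigmoid derivative on $|u|<1$; hence every diagonal entry of $\frac{\partial F(y)}{\partial y}$ is strictly positive.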
For neural network (2.1), the negative-gradient neural network model (actual model) with parameter disturbances can be described as
$$\dot{x}(t)=-\alpha (A+\triangle A(t))^{T}F\big((A+\triangle A(t))x(t)-(b+\triangle b(t))\big), \qquad (2.2)$$
or the negative-gradient neural network model (actual model) with an overall disturbance can be described as
$$\dot{x}(t)=-\alpha A^{T}F(Ax(t)-b)+\triangle c(t), \qquad (2.3)$$
where $\triangle A(t)\in R^{m\times n}$ and $\triangle b(t)\in R^{m}$ are the circuit-implementation errors or parameter disturbances of the matrix $A$ and the vector $b$, respectively, and $\triangle c(t)\in R^{n}$ is the error caused by an inexact implementation of the overall model. There exists a positive number $\varepsilon$ such that the uniformly bounded disturbance condition
$$\|\triangle A(t)\|\le \varepsilon,\qquad \|\triangle b(t)\|\le \varepsilon,\qquad \|\triangle c(t)\|\le \varepsilon,\qquad \forall\, t\ge 0, \qquad (2.4)$$
is satisfied.
First, we give two theorems characterizing the convergence-rate properties of the network and the convergence of the equilibrium point of the disturbed Hopfield neural network.
Theorem 2 [4] The convergence rate of the non-linear neural network system (2.1) depends on the parameter $\alpha$ and on the activation function. The greater $\alpha$ is, the faster the network converges. Moreover:
(1) With the linear activation function, the neural network (2.1) is globally exponentially convergent (see the derivation sketched after this list).
(2) With the bipolar sigmoid activation function, the neural network (2.1) converges faster than with the linear activation function on the interval $[-c, c]\subset [-1, 1]$, where $c$ is the abscissa of the intersection point of the bipolar sigmoid curve and the line $f(u)=u$.
(3) With the power activation function, the neural network converges faster than with the linear activation function on the intervals $(-\infty, -1]$ and $[1, +\infty)$.
(4) With the power-sigmoid activation function, the neural network converges faster than with the linear activation function on the whole interval $(-\infty, +\infty)$.
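To sketch the derivation for item (1): with the linear activation $F(y)=y$, the residual $y(t)=Ax(t)-b$ of model (2.1) obeys a linear time-invariant system whose decay rate grows with $\alpha$ (the quantity $\lambda_{\min}(AA^{T})$ below is introduced only for this illustration):
$$\dot{y}(t)=A\dot{x}(t)=-\alpha AA^{T}y(t)\quad\Longrightarrow\quad \|y(t)\|_{2}\le e^{-\alpha\lambda_{\min}(AA^{T})\,t}\,\|y(0)\|_{2},$$
where $\lambda_{\min}(AA^{T})>0$ because $A$ has full row rank; this is the global exponential convergence stated in item (1), and it improves as $\alpha$ increases.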
The following analysis focuses on the robustness of system (2.3).
Theorem 3 For the disturbed network system (2.3), as long as the parameter $\alpha$ is chosen large enough, the steady-state error of the network satisfies $\|x(t)-x^{*}\|\rightarrow 0$ as $t\rightarrow \infty$. In other words, when the parameter is chosen appropriately, the network still converges to the exact solution of the network system (2.1), even if the implementation errors are large.
Proof Denote by $x^{*}$ the exact solution of the network system (2.1) and by $x(t)$ the disturbed solution of network (2.3) started from an arbitrary initial value. Then the following two equations hold:
$$\dot{x}^{*}=-\alpha A^{T}F(Ax^{*}-b),\qquad \dot{x}(t)=-\alpha A^{T}F(Ax(t)-b)+\triangle c(t).$$
Subtracting the two formulas above and writing $z=x-x^{*}$ for the dynamic error and $y^{*}=Ax^{*}-b$ for the residual, we obtain
$$\dot{z}(t)=-\alpha A^{T}\big[F(y^{*}+Az(t))-F(y^{*})\big]+\triangle c(t). \qquad (3.1)$$
When $\|Az\|$ is small enough, $F(y^{*}+Az)-F(y^{*})\approx\frac{\partial F(y^{*})}{\partial y}\cdot Az$. So the partially linearized system of formula (3.1) at the point $y^{*}$ is
$$\dot{z}(t)=-\alpha A^{T}\frac{\partial F(y^{*})}{\partial y}A\,z(t)+\triangle c(t). \qquad (3.2)$$
By Remark 1, $\frac{\partial F(y^{*})}{\partial y}={\rm diag}(f'(y_{1}^{*}), f'(y_{2}^{*}), \cdots, f'(y_{m}^{*}))$ is a positive definite diagonal matrix. Since $A$ has full row rank, it is not difficult to prove that $A^{T}\frac{\partial F(y^{*})}{\partial y}A$ is also a positive definite matrix. Indeed, for every $x\in R^{n}$,
$$x^{T}A^{T}\frac{\partial F(y^{*})}{\partial y}Ax=\sum_{i=1}^{m}f'(y_{i}^{*})(A_{i}x)^{2}\ge 0,$$
where $A_{i}$ stands for the $i$-th row of $A$, and equality holds if and only if $A_{i}x=0$ for every $i$, namely $Ax=0$, which, since $A$ has full row rank, is equivalent to $x=0$.
Moreover, since the disturbance $\triangle c(t)$ satisfies the bounded condition (2.4), linear system theory shows that, when the parameter $\alpha$ is large enough, the eigenvalues of system (3.2) lie in the left half of the complex plane and far away from the imaginary axis. Thus system (3.2) converges, that is, $z=x-x^{*}\rightarrow 0$. Theorem 3 is proved.
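The role of a large $\alpha$ can also be made quantitative. Writing $M=-\alpha A^{T}\frac{\partial F(y^{*})}{\partial y}A$ and assuming, as argued above, that $A^{T}\frac{\partial F(y^{*})}{\partial y}A$ is positive definite with smallest eigenvalue $\lambda>0$ (the symbols $M$ and $\lambda$ are introduced only for this estimate), the variation-of-constants formula for (3.2) together with the bound (2.4) gives
$$z(t)=e^{Mt}z(0)+\int_{0}^{t}e^{M(t-s)}\triangle c(s)\,ds,\qquad \|z(t)\|\le e^{-\alpha\lambda t}\|z(0)\|+\frac{\varepsilon}{\alpha\lambda}\big(1-e^{-\alpha\lambda t}\big),$$
so the steady-state error is bounded by $\varepsilon/(\alpha\lambda)$ and can be made arbitrarily small by choosing $\alpha$ large enough.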
Remark 2 From the proof of the theorem, the key condition is that the matrix $A^{T}\frac{\partial F(y^{*})}{\partial y}A$ is a positive definite matrix, which requires that
$$\frac{\partial F(y^{*})}{\partial y}={\rm diag}(f'(y_{1}^{*}), f'(y_{2}^{*}), \cdots, f'(y_{m}^{*}))$$
is a positive definite diagonal matrix. Both the power-sigmoid and the bipolar sigmoid activation functions ensure that this condition is met. Therefore, these two activation functions give better performance than the linear activation function.
To test the robustness of the gradient-based neural network, we consider the following LMS problem
The power-sigmoid activation function of neural network system (2.1) is used with $\xi=4$ and $p=3$, and the model-implementation errors take the following sinusoidal form
Using the Matlab built-in function ode45, the solution curves of the network are plotted in Figures 1-3 for $\alpha=1$, $\alpha=10$ and $\alpha=100$, respectively.
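A minimal Matlab sketch of this experiment is given below. Since the coefficient matrix, the right-hand side and the sinusoidal error term of the example are not reproduced above, the values of A, b and dc(t) in the sketch are illustrative placeholders only; just the activation parameters $\xi=4$, $p=3$ and the three values of $\alpha$ follow the text.
\begin{verbatim}
% Sketch of the disturbed gradient network (2.3) with power-sigmoid activation.
% A, b and the sinusoidal disturbance dc(t) are illustrative placeholders only.
A  = [1 2 0; 2 1 1; 0 1 3];            % placeholder coefficient matrix
b  = [1; 2; 3];                        % placeholder right-hand side
xs = A \ b;                            % exact solution x* used as reference
xi = 4;  p = 3;                        % power-sigmoid parameters from the text
dc = @(t) 0.5*sin(3*t)*ones(size(xs)); % placeholder sinusoidal disturbance

% Power-sigmoid activation in its commonly used piecewise form (see Section 2).
scale = (1 + exp(-xi))/(1 - exp(-xi));
f = @(u) (abs(u) >= 1).*u.^p + ...
         (abs(u) <  1).*scale.*(1 - exp(-xi*u))./(1 + exp(-xi*u));

for alpha = [1 10 100]
    rhs = @(t, x) -alpha*A.'*f(A*x - b) + dc(t);   % disturbed model (2.3)
    [t, x] = ode45(rhs, [0 10], zeros(size(xs)));  % integrate from x(0) = 0
    err = sqrt(sum((x - xs.').^2, 2));             % ||x(t) - x*|| over time
    figure; plot(t, err);
    xlabel('t'); ylabel('||x(t) - x^*||');
    title(sprintf('\\alpha = %g', alpha));
end
\end{verbatim}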
Figures 4-6 clearly show that, even in the presence of large errors, network (2.3) operates well and the error norm $\|x(t)-x^{*}\|$ remains uniformly bounded. Moreover, as $\alpha$ increases from 1 to 100, the convergence time becomes shorter and the steady-state error falls to a very small value. The simulation results again confirm the correctness and effectiveness of the theoretical analysis in this paper.
The gradient-based Hopfield neural network (2.1) provides an effective real-time parallel computing approach to solving minimal-norm least-mean-square problems. Using different activation functions, this paper focuses on the disturbed neural network and analyzes its robustness. The results show that when the activation function is of power-sigmoid or bipolar sigmoid form and the design parameter is large enough, the disturbed network with implementation errors still operates well. Matlab simulation results further show that the neural network is effective and efficient when used to solve minimal-norm least-mean-square problems.