1 Introduction and Main Results
Consider the first Piola-Kirchhoff stress $ {\bf{P}} $ defined as [1]
$ \begin{equation} {\bf{P}}({\bf{F}}) = \frac{\partial W({\bf{F}})}{\partial {\bf{F}}}, \end{equation} $
(1.1)
where $ {\bf{F}} $ and $ W $ denote the deformation gradient tensor and the strain energy density, respectively. Due to the principle of material frame-indifference and the material symmetry, two important constraints must be satisfied by this relation [1]:
$ \begin{eqnarray} {\bf{P}}({\bf{Q}}{\bf{F}}) &=& {\bf{Q}}{\bf{P}}({\bf{F}}), \quad \forall {\bf{Q}} \in SO(3); \end{eqnarray} $
(1.2)
$ \begin{eqnarray} {\bf{P}}({\bf{F}}) &=& {\bf{P}}({\bf{F}}{\bf {H}}){\bf {H}}^{T}, \quad \forall {\bf {H}} \in G, \end{eqnarray} $
(1.3)
where $ SO(3) $ stands for the proper orthogonal group and $ G $ refers to the material symmetry group.
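As a quick numerical illustration of (1.2) and (1.3), the following sketch checks both constraints for a toy compressible neo-Hookean model, for which the material symmetry group contains every rotation. The model, its parameters, and the random deformations are illustrative assumptions, not part of the construction in this work.

```python
# A minimal numerical check of frame-indifference (1.2) and material symmetry (1.3),
# assuming a toy compressible neo-Hookean model (not the model of this work):
#   P(F) = mu * (F - F^{-T}) + lam * ln(det F) * F^{-T}
# For an isotropic model every rotation belongs to the symmetry group, so any Q, H in SO(3)
# should satisfy both constraints up to round-off.
import numpy as np

mu, lam = 1.0, 0.5

def P(F):
    Finv_T = np.linalg.inv(F).T
    return mu * (F - Finv_T) + lam * np.log(np.linalg.det(F)) * Finv_T

def random_rotation(rng):
    # QR factorization of a random matrix gives an orthogonal Q; fix the sign so det Q = +1
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1.0
    return Q

rng = np.random.default_rng(0)
F = np.eye(3) + 0.1 * rng.standard_normal((3, 3))   # deformation gradient near the identity
Q, H = random_rotation(rng), random_rotation(rng)

print(np.allclose(P(Q @ F), Q @ P(F)))        # constraint (1.2)
print(np.allclose(P(F), P(F @ H) @ H.T))      # constraint (1.3)
```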
In this work, we propose a deep symmetric network $ {\bf{P}}_{\theta}({\bf {F}}) $ to represent the first PK stress (DPK). By symmetric we mean that the network exactly satisfies the two constraints (1.2) and (1.3). The neural network enables us to construct the constitutive relation from both macroscopic observations and atomistic simulation data.
We first review the classical atomistic PK stress (virial stress), defined as
$ \begin{equation} {\bf{P}}({\bf {F}}) = \frac{1}{\Omega}\frac{\partial V(|{\bf{F}}{\bf {X}}_{12}|, ..., |{\bf{F}}{\bf {X}}_{ij}|, ...)}{\partial {\bf{F}}} = - \frac{1}{2\Omega}\sum\limits_{i}\sum\limits_{j\neq i}{\bf {f}}_{ij}\otimes {\bf {X}}_{ij}, \end{equation} $
(1.4)
in which $ {\bf {X}}_{ij} $ and $ {\bf {f}}_{ij} $ denote the relative displacement and the pair force between the $ i $-th and $ j $-th atoms, and $ V $ stands for the potential function of the atomic system. Under several assumptions on the symmetry of the potential and the crystal structure, we can prove that the virial stress exactly satisfies the two constraints (1.2) and (1.3):
Theorem 1.1 (Symmetry of virial stress) For a given crystal system with periodic boundary conditions, suppose the potential $ V $ is a symmetric function and let $ {\bf{P}}({\bf {F}}) $ be the stress defined by (1.4). Then constraint (1.2) holds for any $ {\bf {Q}} $ in $ SO(3) $, and (1.3) holds for any $ {\bf {H}} $ in the crystallographic symmetry group (point symmetry group $ G_p $, [2]).
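For concreteness, the following sketch evaluates the middle expression of (1.4), the derivative of the potential with respect to $ {\bf{F}} $, for a hypothetical Lennard-Jones-type pair potential on a small set of atoms; periodic images are omitted, and the positions, cell volume, and potential parameters are made-up illustration values.

```python
# A minimal sketch of the virial stress (1.4) for a hypothetical pair potential
#   V = (1/2) * sum_{i} sum_{j != i} phi(|F X_ij|),
# evaluated via the derivative form
#   P(F) = (1/(2*Omega)) * sum_{i} sum_{j != i} phi'(r_ij) / r_ij * (F X_ij) outer X_ij,
# where r_ij = |F X_ij|.
import numpy as np

def phi_prime(r, eps=1.0, sigma=0.3):
    # derivative of the Lennard-Jones potential 4*eps*((sigma/r)**12 - (sigma/r)**6)
    return 4.0 * eps * (-12.0 * sigma**12 / r**13 + 6.0 * sigma**6 / r**7)

def virial_stress(F, X, omega):
    """X: (N, 3) reference atom positions, omega: reference cell volume."""
    P = np.zeros((3, 3))
    N = X.shape[0]
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            Xij = X[i] - X[j]              # reference relative displacement
            rij_vec = F @ Xij              # deformed relative displacement
            rij = np.linalg.norm(rij_vec)
            P += phi_prime(rij) / rij * np.outer(rij_vec, Xij)
    return P / (2.0 * omega)

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(8, 3))     # toy 8-atom configuration
F = np.eye(3) + 0.05 * rng.standard_normal((3, 3))
print(virial_stress(F, X, omega=1.0))
```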
Motivated by the above theorem, we design the deep symmetric network as follows. For a given crystal material, we denote the crystallographic point symmetry group by $ G_p $. For any finite set of trainable 3-dimensional weight vectors $ {\bf {W}} = \{{\bf {w}}_1, ..., {\bf {w}}_N\} $, the symmetrized weight set $ \tilde{{\bf {W}}} $ is defined as the image of $ {\bf {W}} $ under $ G_p $:
$ \begin{equation} \tilde{{\bf {W}}} = \{{\bf {H}}_i {\bf {w}}_j | {\bf {H}}_i \in G_p, {\bf {w}}_j \in {\bf {W}}\}. \end{equation} $
(1.5)
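A minimal sketch of the construction (1.5), assuming a toy point group (the four rotations about the $ z $-axis by multiples of $ 90^\circ $) rather than the full crystallographic group of a real material:

```python
# Build the symmetrized weight set (1.5): the orbit of the trainable weight vectors W
# under a point group G_p.  Both G_p and W below are illustrative placeholders.
import numpy as np

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

G_p = [rot_z(k * np.pi / 2) for k in range(4)]               # toy point group: 4-fold rotations about z
W = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])]   # trainable weight vectors

def symmetrize(W, G_p, tol=1e-8):
    """Return the image set {H w : H in G_p, w in W}, dropping numerical duplicates."""
    tilde_W = []
    for H in G_p:
        for w in W:
            v = H @ w
            if all(np.linalg.norm(v - u) > tol for u in tilde_W):
                tilde_W.append(v)
    return tilde_W

tilde_W = symmetrize(W, G_p)
print(len(tilde_W))    # 5: the orbit of [1,0,0] has 4 elements, [0,0,1] is fixed by every rotation
```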
By construction $ \tilde{{\bf {W}}} = \{\tilde{{\bf {w}}}_1, ..., \tilde{{\bf {w}}}_M\} $ is invariant under $ G_p $. With this we define the deep symmetric network as follows:
Definition 1.1 (DPK) Let $ {\bf {W}} = \{{\bf {w}}_1, ..., {\bf {w}}_N\} $ be $ N $ trainable 3-dimensional weight vectors and $ \tilde{{\bf {W}}} $ the symmetrized weights defined above, and denote by $ g({\bf {x}};\theta) = h_n \circ a_{n-1} \circ ... \circ a_1 \circ h_1({\bf {x}}) $ a classical dense neural network (DNN) with linear functions $ h_i({\bf {x}};\theta_i) $ and activation functions $ a_i({\bf {x}}) $. The deep symmetric network for the PK stress is defined as
$ \begin{equation} {\bf{P}}_{\theta, {\bf {W}}}({\bf {F}}) = {\bf {F}}\sum\limits_{i=1}^{M} \frac{\partial f}{\partial x_i} (|{\bf {F}} \tilde{{\bf {w}}}_1|, ..., |{\bf {F}}\tilde{{\bf {w}}}_M|; \theta)\frac{1}{|{\bf {F}}\tilde{{\bf {w}}}_i|} \tilde{{\bf {w}}}_i\otimes \tilde{{\bf {w}}}_i, \end{equation} $
(1.6)
where $ f({\bf {x}}; \theta) = \frac{1}{M!}\sum_{\tau \in S_M} g(\tau({\bf {x}}); \theta) $, with $ S_M $ the set of all permutations of $ [1, 2, ..., M] $.
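The permutation symmetrization of $ g $ can be sketched as follows for a tiny hand-written network and $ M = 3 $; the network weights are random placeholders, and the check only illustrates that $ f $ is invariant under reordering of its arguments.

```python
# A minimal sketch of the permutation-symmetrized function
#   f(x; theta) = (1/M!) * sum_{tau in S_M} g(tau(x); theta)
# from Definition 1.1, with a toy dense network g (linear -> ReLU -> linear).
import numpy as np
from itertools import permutations

rng = np.random.default_rng(2)
M = 3
W1, b1 = rng.standard_normal((8, M)), rng.standard_normal(8)
W2, b2 = rng.standard_normal((1, 8)), rng.standard_normal(1)

def g(x):
    # classical dense network with one hidden layer
    return float(W2 @ np.maximum(W1 @ x + b1, 0.0) + b2)

def f(x):
    perms = list(permutations(range(M)))
    return sum(g(x[list(p)]) for p in perms) / len(perms)

x = rng.standard_normal(M)
print(np.isclose(f(x), f(x[::-1])))   # True: f is invariant under reordering its arguments
```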
The architecture of this network is illustrated in Fig. 1. Using the fact that $ \tilde{{\bf {W}}} $ is invariant under $ G_p $ and that $ f({\bf {x}}; \theta) = \frac{1}{M!}\sum_{\tau \in S_M} g(\tau({\bf {x}}); \theta) $ is a symmetric function, we obtain that the DPK is a symmetric stress:
Theorem 1.2 (Symmetry of DPK) For a given crystal system with crystallographic symmetry group $ G_p $, let $ {\bf{P}}_{\theta, {\bf {W}}}({\bf {F}}) $ be the DPK stress defined by (1.6). Then constraint (1.2) holds for any $ {\bf {Q}} $ in $ SO(3) $, and (1.3) holds for any $ {\bf {H}} $ in $ G_p $.
Furthermore, by applying existing results on the approximation power of DNNs, we show that the DPK stress has good approximation ability:
Theorem 1.3 (Approximation of DPK) For a finite crystal system consisting of $ N $ atoms with a symmetric potential $ V $ satisfying $ \|{\bf {X}}_{ij}\|_2 <1 $ for any $ j \neq i $, if there exist some $ n>1, B< \infty $ such that $ \|V\|_{W^{n, \infty}((0, 1)^{N(N-1)})} < B $, then for any $ \epsilon \in (0, 1/2) $ there exists a DPK function $ {\bf{P}}_0 $ such that
$ \begin{equation} \|{\bf{P}}_{virial}({\bf {F}}) - {\bf{P}}_0({\bf {F}})\|_F \leq \epsilon, \end{equation} $
(1.7)
whenever $ \|{\bf {FX}}_{ij}\|_2 <1 $. Here $ \|\cdot\|_F $ represents the Frobenius norm and $ {\bf{P}}_{virial}({\bf {F}}) $ is the virial stress defined by (1.4).
Remark 1 Although the DPK architecture introduced above supplies us with a strong approximation tool that exactly satisfies the symmetry conditions, the sum $ f({\bf {x}}; \theta) = \frac{1}{M!}\sum_{\tau \in S_M} g(\tau({\bf {x}}); \theta) $ with a large $ M $ introduces an intolerable computational cost in both the training and inference procedures.
In order to achieve a simplified architecture while keeping its symmetry and approximation capability, we borrow the idea of the quasi-harmonic approximation method [3] and decompose the virtual potential function $ f $ in (1.6) as
$ \begin{equation} f(x_1, x_2, ..., x_M;\theta) = \sum\limits_{i=1}^{M}g(x_i;\theta) \end{equation} $ |
(1.8) |
where $ g(x) = h_n \circ a_{n-1} \circ ... \circ a_1 \circ h_1(x) $ is a classical DNN function. Applying (1.8) to (1.6), we obtain the modified DPK architecture:
$ \begin{equation} {\bf{P}}_{\theta, {\bf {W}}}({\bf {F}}) = {\bf {F}}\sum\limits_{i=1}^{M} \frac{d g}{dx}(|{\bf {F}} \tilde{{\bf {w}}}_i |; \theta)\frac{1}{|{\bf {F}} \tilde{{\bf {w}}}_i |} \tilde{{\bf {w}}}_i\otimes \tilde{{\bf {w}}}_i. \end{equation} $ |
(1.9) |
A further simplification can be achieved once we represent $ \frac{1}{x}g^\prime(x;\theta) = \tilde{g}(x;\tilde{\theta}) $ directly, where $ \tilde{g}(x;\tilde{\theta}) $ is a new DNN function. In this case the simplified DPK can be written as
$ \begin{equation} {\bf{P}}_{\theta, {\bf {W}}}({\bf {F}}) = {\bf {F}}\sum\limits_{i=1}^{M} \tilde{g}(|{\bf {F}} \tilde{{\bf {w}}}_i |; \tilde{\theta}) \tilde{{\bf {w}}}_i\otimes \tilde{{\bf {w}}}_i. \end{equation} $ |
(1.10) |
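A minimal sketch of the forward pass (1.10), assuming $ \tilde{g} $ is a small hand-written ReLU network and the symmetrized weights are random placeholders; in practice both the network parameters and the weight vectors would be trained on stress data.

```python
# A minimal sketch of the simplified DPK forward pass (1.10):
#   P(F) = F * sum_i g_tilde(|F w_i|) * (w_i outer w_i),
# with a toy scalar-input, scalar-output ReLU network standing in for g_tilde.
import numpy as np

rng = np.random.default_rng(3)
A1, b1 = rng.standard_normal((16, 1)), rng.standard_normal(16)
A2, b2 = rng.standard_normal((1, 16)), rng.standard_normal(1)

def g_tilde(r):
    # scalar DNN: linear -> ReLU -> linear
    return float(A2 @ np.maximum(A1 @ np.array([r]) + b1, 0.0) + b2)

def dpk_simplified(F, tilde_W):
    S = np.zeros((3, 3))
    for w in tilde_W:
        r = np.linalg.norm(F @ w)
        S += g_tilde(r) * np.outer(w, w)
    return F @ S

tilde_W = [rng.standard_normal(3) for _ in range(6)]   # placeholder symmetrized weights
F = np.eye(3) + 0.05 * rng.standard_normal((3, 3))
print(dpk_simplified(F, tilde_W))
```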
We numerically illustrate the effectiveness and transferability of (1.10). Additionally, with some classical tools in learning theory such as the fat-shattering dimension, we prove that the estimation error (generalization error) of (1.10) can be made arbitrarily small with a sufficiently large number of training samples:
Theorem 1.4 (Learnability of simplified DPK) Let $ \mathcal{A} = \{{\bf{P}}_{\theta, {\bf {W}}}\} $ be the class of DPK functions defined by (1.10) such that $ \tilde{g}(x;\tilde{\theta}) $ has $ l\geq 2 $ layers, $ N_w $ weights (bounded by $ V $), a Lipschitz continuous activation function with Lipschitz constant $ L > 1/V $, and each unit mapping into $ [-b, b] $ for some $ b>0 $. Given training samples $ S_m = \{({\bf {F}}^1, {\bf{P}}^1), ..., ({\bf {F}}^m, {\bf{P}}^m)\} $ drawn i.i.d. from a distribution $ D $ on $ [-1, 1]^{3\times3} \times [-1, 1]^{3\times3} $ and the loss function
$ \begin{equation} l(f({\bf {X}}), {\bf {Y}}) = \frac{1}{9}\sum\limits_{i, j}l_0(f_{ij}({\bf {X}}), {\bf {Y}}_{ij}), \text{ for } f \in \mathcal{A}, \end{equation} $
(1.11)
where $ l_0(x, y) = |x-y| $, then for any approximate sample error minimization (SME, [4]) algorithm $ \mathbb{M} $ and any $ \epsilon \leq 2\max\{b, V, \sqrt{3V} \}, \delta >0 $, there exists some $ m_0(\epsilon, \delta, \mathcal{A}) < \infty $ such that $ \forall m>m_0 $, with probability at least $ 1-\delta $, $ \mathbb{M} $ returns a function $ P^* \in \mathcal{A} $ satisfying
$ \begin{equation} E_{(x, y)\sim D}[l(P^*(x), y)] < \inf\limits_{f \in \mathcal{A}}E_{(x, y)\sim D}[l(f(x), y)] + \epsilon. \end{equation} $ |
(1.12) |
Remark 2 In practice, the conditions on $ \tilde{g} $ are indeed satisfied, since we use ReLU as the activation function and bound the network weights by an $ l_1 $ regularization.
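A minimal sketch of the corresponding training objective, combining the componentwise absolute loss (1.11) averaged over samples with an $ l_1 $ penalty on the network weights; the penalty coefficient and the interface of the hypothetical predict function are illustrative assumptions.

```python
# Empirical training objective: mean of the componentwise absolute loss (1.11) over the
# samples, plus an l1 penalty on the parameters as mentioned in Remark 2.
import numpy as np

def loss_1_11(P_pred, P_true):
    # l(f(X), Y) = (1/9) * sum_{i,j} |f_ij(X) - Y_ij|
    return np.abs(P_pred - P_true).mean()

def objective(predict, params, samples, l1_coeff=1e-4):
    """predict(F, params) -> 3x3 stress; samples: list of (F, P) pairs; params: list of arrays."""
    data_term = np.mean([loss_1_11(predict(F, params), P) for F, P in samples])
    reg_term = l1_coeff * sum(np.abs(p).sum() for p in params)
    return data_term + reg_term

# toy usage with a one-parameter linear "model" P(F) = a * F
samples = [(np.eye(3), np.eye(3))]
print(objective(lambda F, p: p[0] * F, [np.array([0.9])], samples))
```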
The detailed proof of the theorems mentioned above and the numerical results can be found in [5].