数学杂志 (Journal of Mathematics), 2019, Vol. 39, Issue (1): 137-146
CENTRAL LIMIT THEOREM AND MODERATE DEVIATION FOR NONHOMOGENEOUS MARKOV CHAINS
XU Ming-zhou, DING Yun-zheng, ZHOU Yong-zheng    
School of Information and Engineering, Jingdezhen Ceramic Institute, Jingdezhen 333403, China
Abstract: In this article, we study the central limit theorem for countable nonhomogeneous Markov chains under the condition that the transition probability matrices converge uniformly in the Cesàro sense. By the Gärtner-Ellis theorem and the exponential equivalence method, we obtain a corresponding moderate deviation theorem for countable nonhomogeneous Markov chains.
Keywords: central limit theorem     moderate deviation     nonhomogeneous Markov chain     martingale
1 Introduction

Huang et al. [1] proved a central limit theorem for nonhomogeneous Markov chains with finite state space. Gao [2] obtained moderate deviation principles for homogeneous Markov chains. De Acosta [3] studied moderate deviation lower bounds for homogeneous Markov chains, and de Acosta and Chen [4] established the corresponding upper bounds. It is therefore natural and important to study the central limit theorem and moderate deviations for countable nonhomogeneous Markov chains. In this paper we investigate a central limit theorem and a moderate deviation principle for countable nonhomogeneous Markov chains under the condition of uniform convergence of the transition probability matrices in the Cesàro sense.

Suppose that $ \{X_n, n\ge 0\} $ is a nonhomogeneous Markov chain taking values in $ S = \{1, 2, \cdots\} $ with initial probability

$ \begin{equation} \mu^{(0)} = (\mu(1), \mu(2), \cdots) \end{equation} $ (1.1)

and the transition matrices

$ \begin{equation} P_n = (p_n(i, j)), \mbox{ }i, j\in S, n\ge 1, \end{equation} $ (1.2)

where $ p_n(i, j) = \mathbb{P}(X_n = j|X_{n-1} = i) $. Write

$ \begin{eqnarray*} &&P^{(m, n)} = P_{m+1}P_{m+2}\cdots P_{n}, p^{(m, n)}(i, j) = \mathbb{P}(X_n = j|X_m = i), \\ &&\mu^{(k)} = \mu^{(0)}P_1P_2\cdots P_k, \mu^{(k)}(j) = \mathbb{P}(X_k = j). \end{eqnarray*} $

When the Markov chain is homogeneous, we write $ P $ for $ P_n $ and $ P^k $ for $ P^{(m, m+k)} $.

If $ P $ is a stochastic matrix, then we write

$ \delta(P) = \mathop {\sup }\limits_{i, k} \sum\limits_{j = 1}^{\infty}\left[p(i, j)-p(k, j)\right]^{+}, $

where $ [a]^{+} = \max\{0, a\} $.

Let $ A = (a_{ij}) $ be a matrix defined on $ S\times S $. Write $ \|A\| = \sup\limits_{i\in S}\sum\limits_{j\in S}|a_{ij}|. $

If $ h = (h_1, h_2, \cdots) $ is a row vector, we write $ \|h\| = \sum\limits_{j\in S}|h_j| $; if $ g = (g_1, g_2, \cdots)' $ is a column vector, we write $ \|g\| = \sup\limits_{i\in S}|g_i| $. The following properties hold (see Yang [5, 6]):

(a) $ \|AB\|\le \|A\|\|B\| $ for all matrices $ A $ and $ B $;

(b) $ \|P\| = 1 $ for every stochastic matrix $ P $.
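
For readers who wish to check these definitions numerically, the following minimal Python sketch (an illustration added here, not part of the original argument) computes $ \|A\| $ and $ \delta(P) $ for small finite matrices and verifies properties (a) and (b); the matrices `P` and `Q` below are arbitrary examples of our own choosing.

```python
import numpy as np

def matrix_norm(A):
    # ||A|| = sup_i sum_j |a_ij| (maximal absolute row sum)
    return np.max(np.sum(np.abs(A), axis=1))

def delta(P):
    # delta(P) = sup_{i,k} sum_j [p(i,j) - p(k,j)]^+
    diffs = P[:, None, :] - P[None, :, :]   # diffs[i, k, j] = p(i,j) - p(k,j)
    return np.max(np.sum(np.maximum(diffs, 0.0), axis=2))

P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 1.0, 0.0]])
Q = np.array([[1.0, 0.0, 0.0],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]])

print(matrix_norm(P))                                          # 1.0, property (b)
print(matrix_norm(P @ Q) <= matrix_norm(P) * matrix_norm(Q))   # True, property (a)
print(delta(P), delta(Q))                                      # delta-coefficients
```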

Suppose that $ R $ is a 'constant' stochastic matrix each row of which is the same. Then $ \{P_n, n\ge 1\} $ is said to be strongly ergodic (with a constant stochastic matrix $ R $) if for all $ m\ge 0 $, $ \lim\limits_{n\rightarrow\infty}\|P^{(m, m+n)}-R\| = 0. $ The sequence $ \{P_n, n\ge 1\} $ is said to converge in the Cesàro sense (to a constant stochastic matrix $ R $) if for every $ m\ge 0 $,

$ \mathop {\lim }\limits_{n \to \infty } \left\| {\sum\limits_{t = 1}^n {{P^{(m, m + t)}}} /n - R} \right\| = 0. $

The sequence $ \{P_n, n\ge 1\} $ is said to uniformly converge in the Cesàro sense (to a constant stochastic matrix $ R $) if

$ \mathop {\lim }\limits_{n \to \infty } \mathop {\sup }\limits_{m \ge 0} \left\| {\sum\limits_{t = 1}^n {{P^{(m, m + t)}}} /n - R} \right\| = 0. $ (1.3)

An irreducible stochastic matrix $ P $ of period $ d $ ($ d\ge 1 $) divides $ S $ into $ d $ disjoint subclasses $ C_0 $, $ C_1 $, $ \cdots $, $ C_{d-1} $ (see Theorem 3.3 of Hu [7]), and $ P^d $ induces $ d $ stochastic matrices $ \{T_l, 0\le l\le d-1\} $, where $ T_l $ is defined on $ C_l $. As in Bowerman et al. [8] and Yang [5], we shall consider an irreducible stochastic matrix $ P $ of period $ d $ such that $ T_l $ is strongly ergodic for every $ l = 0, 1, \cdots, d-1 $. Such a matrix will be called periodic strongly ergodic.

Remark 1.1    If $ S = \{1, 2, \cdots\} $, $ d = 2 $, $ P = (p(i, j)) $, $ p(1, 2) = 1 $, $ p(k, k-1) = 1-p(k, k+1) = \frac{k-1}{k} $ for $ k\ge 2 $, then $ P $ is an irreducible stochastic matrix of period $ 2 $. Moreover,

$ \begin{eqnarray*} &&P^2 = (p^2(i, j)), p^2(1, 1) = p^2(1, 3) = 1/2, p^2(k, k) = \frac1k+\frac{1}{k+1}, \\ &&p^2(k, k+2) = \frac{1}{k(k+1)}, p^2(k, k-2) = \frac{k-2}{k} \end{eqnarray*} $

for $ k\ge 2 $.

$ C_0 = \{1, 3, \cdots\}, C_1 = \{2, 4, \cdots\}, T_0 = (t_0(i, j)), T_1 = (t_1(i, j)), $

where

$ \begin{eqnarray*} &&t_0(1, 1) = t_0(1, 3) = 1/2, \; t_0(2k+1, 2k+1) = \frac{1}{2k+1}+\frac{1}{2k+2}, \\ &&t_0(2k+1, 2k+3) = \frac{1}{(2k+1)(2k+2)}, \; t_0(2k+1, 2k-1) = \frac{2k-1}{2k+1}, \\ &&t_1(2k, 2k) = \frac{1}{2k}+\frac{1}{2k+1}, \; t_1(2k, 2k+2) = \frac{1}{2k(2k+1)}, \; t_1(2k, 2k-2) = \frac{k-1}{k} \end{eqnarray*} $

for $ k\ge 1 $. The solution of $ \pi P = \pi $ and $ \sum\limits_{i}\pi(i) = 1 $ is

$ \pi(1) = \frac{1}{1+2+\sum\limits_{n = 3}^{\infty}\frac{n}{(n-1)!}} = \frac{1}{2e}, \quad \pi(2) = 2\pi(1), \quad \pi(n) = \frac{n}{(n-1)!}\pi(1) $

for $ n\ge 3 $.
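
As an informal numerical check of Remark 1.1 (our own illustration, not part of the paper), one may truncate the chain at a level $ N $, reflect at the boundary, and verify that the vector $ \pi(n) = \frac{n}{(n-1)!}\cdot\frac{1}{2e} $ solves $ \pi P = \pi $ and sums to one up to truncation and rounding error. The truncation level and the reflecting boundary are assumptions of the sketch only.

```python
import numpy as np
from math import factorial, e

N = 30                             # truncation level; pi(n) ~ n/(n-1)! decays very fast
P = np.zeros((N, N))               # states 1..N stored at indices 0..N-1
P[0, 1] = 1.0                      # p(1,2) = 1
for k in range(2, N):              # p(k,k-1) = (k-1)/k, p(k,k+1) = 1/k
    P[k - 1, k - 2] = (k - 1) / k
    P[k - 1, k] = 1.0 / k
P[N - 1, N - 2] = 1.0              # reflect at the truncation boundary (assumption)

pi = np.array([n / factorial(n - 1) for n in range(1, N + 1)]) / (2 * e)
print(pi.sum())                    # ~ 1.0
print(np.max(np.abs(pi @ P - pi))) # ~ 0 up to truncation/rounding error
```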

Theorem 1.1   Suppose $ \{X_n, n\ge0\} $ is a countable nonhomogeneous Markov chain taking values in $ S = \{1, 2, \cdots\} $ with initial distribution of (1.1) and transition matrices of (1.2). Assume that $ f $ is a real function satisfying $ |f(x)|\le M $ for all $ x\in \mathbb{R} $. Suppose that $ P $ is a periodic strongly ergodic stochastic matrix. Assume that $ R $ is a constant stochastic matrix each row of which is the left eigenvector $ \pi = (\pi(1), \pi(2), \cdots) $ of $ P $ satisfying $ \pi P = \pi $ and $ \sum\limits_{i}\pi(i) = 1 $. Assume that

$ \begin{equation} \mathop {\lim }\limits_{n \to \infty } \mathop {\sup }\limits_{m \ge 0}\frac1n\sum\limits_{k = 1}^{n}\|P_{k+m}-P\| = 0 \end{equation} $ (1.4)

and

$ \begin{equation} \theta = \sum\limits_{i\in S}\pi(i)[f^2(i)-(\sum\limits_{j\in S}f(j)p(i, j))^2]>0. \end{equation} $ (1.5)

Moreover, if the sequence of $ \delta $-coefficients satisfies

$ \mathop {\lim }\limits_{n \to \infty } \frac{{\sum\limits_{k = 1}^n \delta ({P_k})}}{{\sqrt n }} = 0, $ (1.6)

then we have

$ \begin{equation} \frac{S_n-E(S_n)}{\sqrt{n\theta}}\stackrel{D}{\Rightarrow}N(0, 1), \end{equation} $ (1.7)

where $ S_n = \sum\limits_{k = 1}^{n}f(X_k) $, $ \stackrel{D}{\Rightarrow} $ stands for the convergence in distribution.
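
The following Monte Carlo sketch is purely illustrative and not from the paper: the matrices `R` and `Q`, the function `f`, and the rate $ k^{-0.6} $ are our own choices, for which one can check that (1.4)-(1.6) hold with $ P = R $ (a constant stochastic matrix is trivially periodic strongly ergodic) and $ \theta = \mathrm{Var}_\pi(f) > 0 $. It simulates the nonhomogeneous chain with $ P_k = (1-k^{-0.6})R + k^{-0.6}Q $ and checks that the standardized sums in (1.7) have empirical standard deviation close to $ 1 $.

```python
import numpy as np

rng = np.random.default_rng(0)
S = 3                                    # small illustrative state space
pi = np.array([0.5, 0.3, 0.2])
R = np.tile(pi, (S, 1))                  # constant stochastic matrix (all rows = pi)
Q = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
f = np.array([1.0, -1.0, 2.0])

def P_k(k):
    eps = k ** -0.6                      # delta(P_k) -> 0 fast enough for (1.6)
    return (1 - eps) * R + eps * Q

n, reps = 1000, 2000                     # pure-Python loops; takes a little while
S_n = np.zeros(reps)
for r in range(reps):
    x = rng.choice(S, p=pi)              # initial distribution (1.1)
    total = 0.0
    for k in range(1, n + 1):
        x = rng.choice(S, p=P_k(k)[x])
        total += f[x]
    S_n[r] = total

theta = np.sum(pi * (f ** 2 - (R @ f) ** 2))   # theta of (1.5) with P = R
Z = (S_n - S_n.mean()) / np.sqrt(n * theta)    # E(S_n) replaced by its empirical mean
print(theta, Z.std())                          # Z.std() should be close to 1
```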

Theorem 1.2   Under the hypotheses of Theorem 1.1, if moreover

$ \mathop {\lim }\limits_{n \to \infty } \frac{{a(n)}}{{\sqrt n }} = \infty , \mathop {\lim }\limits_{n \to \infty } \frac{{a(n)}}{n} = 0, $ (1.8)

then for each open set $ G\subset \mathbb{R}^1 $,

$ \mathop {\liminf }\limits_{n \to \infty } \frac{n}{a^2(n)}\log \mathbb{P} \left\{\frac{S_n-E(S_n)}{a(n)}\in G\right\}\ge -\inf\limits_{x\in G}I(x), $

and for each closed set $ F\subset \mathbb{R}^1 $,

$ \mathop {\limsup }\limits_{n \to \infty } \frac{n}{a^2(n)}\log \mathbb{P} \left\{\frac{S_n-E(S_n)}{a(n)}\in F\right\}\le -\inf\limits_{x\in F}I(x), $

where $ I(x): = \frac{x^2}{2\theta} $.
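
For the reader's convenience we record a standard computation behind the rate function (not displayed in the original): $ I $ is the Legendre transform of the limiting scaled cumulant generating function $ \Lambda(t) = t^2\theta/2 $ obtained in Section 3, namely

$ I(x) = \sup\limits_{t\in \mathbb{R}}\left\{tx-\frac{t^2\theta}{2}\right\} = \frac{x^2}{2\theta}, $

with the supremum attained at $ t = x/\theta $.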

In Sections 2 and 3, we prove Theorems 1.1 and 1.2, respectively. The ideas of the proof of Theorem 1.1 come from Huang et al. [1] and Yang [5].

2 Proof of Theorem 1.1

Let

$ \begin{eqnarray} &&D_n = f(X_n)-E[f(X_n)|X_{n-1}], n\ge 1, \mbox{ } D_0 = 0, \end{eqnarray} $ (2.1)
$ \begin{eqnarray} &&W_n = \sum\limits_{k = 1}^{n}D_k. \end{eqnarray} $ (2.2)

Write $ {\mathcal F}_n = \sigma(X_k, 0\le k\le n) $. Then $ \{W_n, {\mathcal F}_n, n\ge1\} $ is a martingale, so that $ \{D_n, {\mathcal F}_n, n\ge 0\} $ is the related martingale difference. For $ n = 1, 2, \cdots $, set

$ V(W_n): = \sum\limits_{k = 1}^{n}E[D_k^2|{\mathcal F}_{k-1}] $

and

$ v(W_n): = E[V(W_n)]. $

It is clear that

$ v(W_n) = E[W_n^2] = E[V(W_n)]. $
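
The decomposition (2.1)-(2.2) is straightforward to evaluate along a sample path. The following Python sketch (an illustration added here, with our own function name and interface, not code from the paper) computes $ D_k $, $ W_n $ and $ V(W_n) $ from a path, a sequence of transition matrices, and a bounded $ f $.

```python
import numpy as np

def martingale_decomposition(path, P_seq, f):
    """path = (X_0, ..., X_n) as state indices, P_seq = (P_1, ..., P_n) as
    row-stochastic numpy arrays, f = numpy vector of values f(i).
    Returns the martingale differences D_k of (2.1), W_n of (2.2), and V(W_n).
    Illustrative sketch only."""
    D, V = [], 0.0
    for k in range(1, len(path)):
        row = P_seq[k - 1][path[k - 1]]       # p_k(X_{k-1}, .)
        cond_mean = row @ f                   # E[f(X_k) | X_{k-1}]
        cond_sq = row @ (f ** 2)              # E[f(X_k)^2 | X_{k-1}]
        D.append(f[path[k]] - cond_mean)      # D_k
        V += cond_sq - cond_mean ** 2         # E[D_k^2 | F_{k-1}]
    return np.array(D), float(np.sum(D)), V   # (D_1..D_n, W_n, V(W_n))
```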

As in Huang et al. [1], to prove Theorem 1.1, we first state the central limit theorem associated with the stochastic sequence of $ \{W_n\}_{n\ge 1} $, which is a key step to establish Theorem 1.1.

Lemma 2.1   Assume $ \{X_n, n\ge0\} $ is a countable nonhomogeneous Markov chain taking values in $ S = \{1, 2, \cdots\} $ with initial distribution of (1.1) and transition matrices of (1.2). Suppose $ f $ is a real function satisfying $ |f(x)|\le M $ for all $ x\in \mathbb{R} $. Assume that $ P $ is a periodic strongly ergodic stochastic matrix, and $ R $ is a constant stochastic matrix each row of which is the left eigenvector $ \pi = (\pi(1), \pi(2), \cdots) $ of $ P $ satisfying $ \pi P = \pi $ and $ \sum\limits_{i}\pi(i) = 1 $. Suppose that (1.4) and (1.5) are satisfied, and $ \{W_n, n\ge0\} $ is defined by (2.2). Then

$ \begin{equation} \frac{W_n}{\sqrt{n\theta}}\stackrel{D}{\Rightarrow}N(0, 1), \end{equation} $ (2.3)

where $ \stackrel{D}{\Rightarrow} $ stands for the convergence in distribution.

As in Huang et al. [1], to establish Lemma 2.1 we need the following two results, Lemma 2.2 (see Brown [9]) and Lemma 2.3 (see Yang [6]).

Lemma 2.2   Assume that $ (\Omega, {\mathcal F}, \mathbb{P}) $ is a probability space, and $ \{{\mathcal F}_n, n = 1, 2, \cdots\} $ is an increasing sequence of $ \sigma $-algebras. Suppose that $ \{M_n, {\mathcal F}_n, n = 1, 2, \cdots\} $ is a martingale, denote its related martingale difference by $ \xi_0 = 0 $, $ \xi_n = M_n-M_{n-1} $ $ (n = 1, 2, \cdots) $. For $ n = 1, 2, \cdots $, write

$ \begin{eqnarray*} V(M_n) = \sum\limits_{j = 1}^{n}E[\xi^2_j|{\mathcal F}_{j-1}], \; \; v(M_n) = E[V(M_n)], \end{eqnarray*} $

where $ {\mathcal F}_0 $ is the trivial $ \sigma $-algebra. Assume that the following holds

(i)

$ \begin{equation} \frac{V(M_n)}{v(M_n)}\stackrel{P}{\Rightarrow}1, \end{equation} $ (2.4)

(ii) the Lindeberg condition holds, i.e., for any $ \epsilon>0 $,

$ \mathop {\lim }\limits_{n \to \infty } \frac{\sum\limits_{j = 1}^{n}E[\xi^2_jI(|\xi_j|\ge \epsilon\sqrt{v(M_n)})]}{v(M_n)} = 0, $

where $ I(\cdot) $ denotes the indicator function. Then we have

$ \begin{equation} \frac{M_n}{\sqrt{v(M_n)}}\stackrel{D}{\Rightarrow}N(0, 1), \end{equation} $ (2.5)

where $ \stackrel{P}{\Rightarrow} $ and $ \stackrel{D}{\Rightarrow} $ denote convergence in probability and in distribution respectively.

Write $ \delta_i(j) = \delta_{ij} $, $ (i, j\in S) $. Set

$ L_n(i) = \sum\limits_{k = 0}^{n-1}\delta_i(X_k). $

Lemma 2.3   Assume that $ \{X_n, n\ge 0\} $ is a countable nonhomogeneous Markov chain taking values in $ S = \{1, 2, \cdots\} $ with initial distribution (1.1) and transition matrices (1.2). Suppose that $ P $ is a periodic strongly ergodic stochastic matrix, and $ R $ is a constant stochastic matrix each row of which is the left eigenvector $ \pi = (\pi(1), \pi(2), \cdots) $ of $ P $ satisfying $ \pi P = \pi $ and $ \sum\limits_{i}\pi(i) = 1 $. Assume (1.4) holds. Then

$ \begin{equation} \mathop {\lim }\limits_{n \to \infty } \frac1n L_n(i) = \pi(i)\; \; \mbox{ a.e.}. \end{equation} $ (2.6)
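
Lemma 2.3 can also be illustrated numerically (our own illustration, not part of the proof). Taking the chain of Remark 1.1 with constant transition matrices $ P_k = P $, for which (1.4) holds trivially, and using the same reflecting truncation as in the sketch after Remark 1.1, one can compare the occupation frequencies $ L_n(i)/n $ with $ \pi(i) $:

```python
import numpy as np
from math import factorial, e

rng = np.random.default_rng(1)
N = 30                                  # truncation level (assumption), reflecting boundary
P = np.zeros((N, N))
P[0, 1] = 1.0
for k in range(2, N):
    P[k - 1, k - 2] = (k - 1) / k
    P[k - 1, k] = 1.0 / k
P[N - 1, N - 2] = 1.0

pi = np.array([m / factorial(m - 1) for m in range(1, N + 1)]) / (2 * e)

n = 200_000
x, counts = 0, np.zeros(N)              # start from state 1
for _ in range(n):
    counts[x] += 1                      # counts[i-1] = L_n(i)
    x = rng.choice(N, p=P[x])

print(np.round(counts[:4] / n, 4))      # empirical frequencies L_n(i)/n, i = 1..4
print(np.round(pi[:4], 4))              # pi(1..4) = (0.1839, 0.3679, 0.2759, 0.1226)
```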

We now turn to the proof of Lemma 2.1.

Proof of Lemma 2.1   Applications of properties of the conditional expectation and Markov chains yield

$ \begin{equation} \begin{aligned} &\frac{V(W_n)}{n} = \frac1n\sum\limits_{k = 1}^{n}E[D_k^2|{\mathcal F}_{k-1}]\\ = &\frac1n\sum\limits_{k = 1}^{n}\{E[f^2(X_k)|X_{k-1}]-[E[f(X_k)|X_{k-1}]]^2\}: = I_1(n)-I_2(n), \end{aligned} \end{equation} $ (2.7)

where

$ \begin{equation} \begin{aligned} I_1(n) = \frac1n\sum\limits_{k = 1}^{n}E[f^2(X_k)|X_{k-1}] = \sum\limits_{j\in S}\sum\limits_{i\in S}f^2(j)\frac1n\sum\limits_{k = 1}^{n}p_k(i, j)\delta_i(X_{k-1}) \end{aligned} \end{equation} $ (2.8)

and

$ \begin{equation} \begin{aligned} I_2(n) = \frac1n\sum\limits_{k = 1}^{n}[E[f(X_k)|X_{k-1}]]^2 = \sum\limits_{i\in S}\sum\limits_{j, \ell\in S}f(j)f(\ell)\frac1n\sum\limits_{k = 1}^{n}p_k(i, j)p_k(i, \ell)\delta_i(X_{k-1}). \end{aligned} \end{equation} $ (2.9)

We first use (1.4) and Fubini's theorem to obtain

$ \begin{equation} \begin{aligned}&\mathop {\lim }\limits_{n \to \infty } \sum\limits_{i\in S}\frac1n\sum\limits_{k = 1}^{n}\sum\limits_{j\in S}\delta_i(X_{k-1})|p_k(i, j)-p(i, j)| \le \mathop {\lim }\limits_{n \to \infty } \frac1n \sum\limits_{i\in S}\sum\limits_{k = 1}^{n}\delta_i(X_{k-1})\|P_k-P\|\\ = &\mathop {\lim }\limits_{n \to \infty } \frac1n\sum\limits_{k = 1}^{n} \sum\limits_{i\in S}\delta_i(X_{k-1})\|P_k-P\| \le \mathop {\lim }\limits_{n \to \infty } \frac1n\sum\limits_{k = 1}^{n} \|P_k-P\| = 0. \end{aligned} \end{equation} $ (2.10)

Hence, it follows from (2.10) and $ \pi P = \pi $ that

$ \begin{equation} \begin{aligned} &\mathop {\lim }\limits_{n \to \infty } I_1(n) = \mathop {\lim }\limits_{n \to \infty } \sum\limits_{j\in S}\sum\limits_{i\in S}f^2(j)\frac1n\sum\limits_{k = 1}^{n}p(i, j)\delta_i(X_{k-1})\\ = &\sum\limits_{j\in S}\sum\limits_{i\in S}f^2(j)p(i, j)\pi(i) = \sum\limits_{j\in S}f^2(j)\pi(j)\; \; \mbox{a.e.}. \end{aligned} \end{equation} $ (2.11)

We next claim that

$ \begin{equation} \mathop {\lim }\limits_{n \to \infty } I_2(n) = \sum\limits_{i\in S}\pi(i)\left[\sum\limits_{j\in S}f(j)p(i, j)\right]^2\; \; \mbox{a.e.}. \end{equation} $ (2.12)

Indeed, we use (1.4) and (2.9) to have

$ \begin{aligned} &\left|I_2(n)-\sum\limits_{i\in S}\sum\limits_{j, \ell\in S}f(j)f(\ell)\frac1n\sum\limits_{k = 1}^{n}p(i, j)p(i, \ell)\delta_i(X_{k-1})\right|\\ \le& \left|\sum\limits_{i\in S}\sum\limits_{j, \ell\in S}f(j)f(\ell)\frac1n\sum\limits_{k = 1}^{n}\delta_i(X_{k-1})(p_k(i, j)-p(i, j))p_k(i, \ell)\right.\\ &\left.+\sum\limits_{i\in S}\sum\limits_{j, \ell\in S}f(j)f(\ell)\frac1n\sum\limits_{k = 1}^{n}\delta_i(X_{k-1})p(i, j)(p_k(i, \ell)-p(i, \ell))\right|\\ \le& M^2\left(\frac1n\sum\limits_{k = 1}^{n}\sum\limits_{i\in S}\delta_i(X_{k-1})\|P_k-P\|+\frac1n\sum\limits_{k = 1}^{n}\sum\limits_{i\in S}\delta_i(X_{k-1})\|P_k-P\|\right)\\ \le& 2M^2\frac1n\sum\limits_{k = 1}^{n}\|P_k-P\|\rightarrow0, \mbox{ as }n\rightarrow\infty. \end{aligned} $

Thus we use Lemma 2.3 again to obtain

$ \begin{aligned} \mathop {\lim }\limits_{n \to \infty } I_2(n)& = \sum\limits_{i\in S}\sum\limits_{j, \ell\in S}f(j)f(\ell)p(i, j)p(i, \ell)\mathop {\lim }\limits_{n \to \infty } \frac1n\sum\limits_{k = 1}^{n}\delta_i(X_{k-1})\\ & = \sum\limits_{i\in S}\sum\limits_{j, \ell\in S}f(j)f(\ell)p(i, j)p(i, \ell)\pi(i)\\ & = \sum\limits_{i\in S}\pi(i)[\sum\limits_{j\in S}f(j)p(i, j)]^2\; \; \mbox{a.e..} \end{aligned} $

Therefore (2.12) holds. Combining (2.11) and (2.12) results in

$ \begin{equation} \mathop {\lim }\limits_{n \to \infty } \frac{V(W_n)}{n} = \sum\limits_{i\in S}\pi(i)[f^2(i)-(\sum\limits_{j\in S}f(j)p(i, j))^2] \;\;\mbox{ a.e.}, \end{equation} $ (2.13)

which gives

$ \begin{equation} \mathop {\lim }\limits_{n \to \infty } \frac{V(W_n)}{n} = \sum\limits_{i\in S}\pi(i)[f^2(i)-(\sum\limits_{j\in S}f(j)p(i, j))^2] \mbox{ in probability.} \end{equation} $ (2.14)

Since $ \{V(W_n)/n, n\ge 1\} $ is uniformly bounded, it is uniformly integrable. Combining this with (2.14) and (1.5), we have

$ \begin{equation} \mathop {\lim }\limits_{n \to \infty } \frac{E[V(W_n)]}{n} = \sum\limits_{i\in S}\pi(i)[f^2(i)-(\sum\limits_{j\in S}f(j)p(i, j))^2] >0. \end{equation} $ (2.15)

Therefore we obtain

$ \frac{V(W_n)}{v(W_n)}\stackrel{P}{\Rightarrow}1. $

Also note that $ \{D^2_n = [f(X_n)-E[f(X_n)|X_{n-1}]]^2\} $ is uniformly integrable. Thus

$ \mathop {\lim }\limits_{n \to \infty } \frac{\sum\limits_{j = 1}^{n}E[D^2_jI(|D_j|\ge\epsilon\sqrt{n} )]}{n} = 0, $

which implies that the Lindeberg condition holds. Application of Lemma 2.2 yields (2.3). This establishes Lemma 2.1.

Proof of Theorem 1.1   Note that

$ \begin{equation} S_n-E[S_n] = W_n+\sum\limits_{k = 1}^{n}[E[f(X_k)|X_{k-1}]-E[f(X_k)]]. \end{equation} $ (2.16)

Write

$ \mathbb{P}(X_k = j) = P_k(j), j\in S. $

We now bound $ |E[f(X_k)|X_{k-1}]-E[f(X_k)]| $ from above. Using the Chapman-Kolmogorov equation for the Markov chain, we obtain

$ \begin{aligned} &|E[f(X_k)|X_{k-1}]-E[f(X_k)]| = \left|\sum\limits_{j\in S}f(j)P_k(j|X_{k-1})-\sum\limits_{j\in S}f(j)P_k(j)\right|\\ \le& \sup\limits_{i}\left|\sum\limits_{j\in S}f(j)\left[P_k(j|i)-\sum\limits_{s}P_{k-1}(s)P_k(j|s)\right]\right|\le M\sup\limits_{i}\sum\limits_{j\in S}\left|P_k(j|i)-\sum\limits_{s}P_{k-1}(s)P_k(j|s)\right|\\ = & M\sup\limits_{i}\sum\limits_{j\in S}\left|\sum\limits_{s}P_{k-1}(s)[P_k(j|i)-P_k(j|s)]\right|\\ \le& M\sup\limits_{i}\sum\limits_{s}P_{k-1}(s)\sum\limits_{j\in S}|P_k(j|i)-P_k(j|s)|\\ \le& M\sup\limits_{i, s}\sum\limits_{j\in S}|P_k(j|i)-P_k(j|s)| = 2M\delta(P_k), \end{aligned} $

where, since $ \sum\limits_{j\in S}P_k(j|i) = \sum\limits_{j\in S}P_k(j|s) = 1 $ for every pair $ (i, s) $,

$ \delta(P_k) = \sup\limits_{i, s}\sum\limits_{j\in S}[P_k(j|i)-P_k(j|s)]^{+} = \frac12\sup\limits_{i, s}\sum\limits_{j\in S}|P_k(j|i)-P_k(j|s)|. $

Application of (1.6) yields

$ \begin{equation} \mathop {\lim }\limits_{n \to \infty } \frac{\sum\limits_{k = 1}^{n}[E[f(X_k)|X_{k-1}]-E[f(X_k)]]}{\sqrt{n}} = 0. \end{equation} $ (2.17)

Combining (1.6), (2.3), (2.16) and (2.17) results in (1.7). This proves Theorem 1.1.

3 Proof of Theorem 1.2

We use the Gärtner-Ellis theorem and the exponential equivalence method to prove Theorem 1.2. By applying the Taylor expansion of $ e^{x} $, (1.5), (1.8), (2.15), Fubini's theorem, and properties of conditional expectations and martingales, we claim that for any $ t\in \mathbb{R}^1 $,

$ \begin{align*} &\mathop {\lim }\limits_{n \to \infty } \frac{n}{a^2(n)}\log E\left[\exp\left[\frac{a(n)}{n}t W_n\right]\right]\\ = &\mathop {\lim }\limits_{n \to \infty } \frac{n}{a^2(n)}\log E\left[1+\left[\frac{a(n)}{n}t W_n\right]+\sum\limits_{k = 2}^{\infty}\left[\frac{a(n)}{n}t W_n\right]^k/k!\right]\\ = &\mathop {\lim }\limits_{n \to \infty } \frac{n}{a^2(n)}\log\left[1+\sum\limits_{k = 2}^{\infty}E\left[\frac{a(n)}{n}t W_n\right]^k/k!\right]\\ = &\mathop {\lim }\limits_{n \to \infty } \frac{n}{a^2(n)}\log\left[1+\sum\limits_{k = 2}^{\infty}\left[\left(\frac{a(n)}{n}t\right)^k \sum\limits_{1\le i_1, i_2, \cdots, i_k\le n}E[D_{i_1}D_{i_2}\cdots D_{i_k}]\right]/k!\right]\\ = &\mathop {\lim }\limits_{n \to \infty } \frac{n}{a^2(n)}\log\left[1+\sum\limits_{k = 2}^{\infty}\left[\left(\frac{a(n)}{n}t\right)^k \sum\limits_{1\le i\le n}E[D_i^k]\right]/k!\right]\\ = &\mathop {\lim }\limits_{n \to \infty } \frac{n}{a^2(n)}\log \left[1+\frac{a^2(n)t^2}{2n}\frac{ \sum\limits_{k = 1}^{n}E(D_k^2)}{n}+\sum\limits_{k = 3}^{\infty}\frac{a^k(n)}{n^kk!}t^k \sum\limits_{i = 1}^{n}E[D_i^k]\right]\\ = &\mathop {\lim }\limits_{n \to \infty } \frac{n}{a^2(n)}\log \left[1+\frac{a^2(n)t^2}{2n}\frac{ \sum\limits_{k = 1}^{n}E(D_k^2)}{n}+o(\frac{a^2(n)}{n})\right]\\ = &\frac{t^2\theta}{2}. \end{align*} $

In fact, by (1.8),

$ \begin{align*} &\mathop {\lim }\limits_{n \to \infty } \left|\sum\limits_{k = 3}^{\infty}\frac{a^k(n)}{n^kk!}t^k \sum\limits_{i = 1}^{n}E[D_i^k]/\left[\frac{a^2(n)}{n}\right]\right|\\ \le&\mathop {\lim }\limits_{n \to \infty } \left|\sum\limits_{k = 3}^{\infty}\frac{a^{k-2}(n)}{n^{k-1}}t^k \cdot n (2\sup\limits_{x\in S}|f(x)|)^k\right|\\ = &\mathop {\lim }\limits_{n \to \infty } \frac{a(n)(2t\sup\limits_{x\in S}|f(x)|)^3}{n}\frac1{1-a(n)t2\sup\limits_{x\in S}|f(x)|/n}\\ = &0, \end{align*} $

and the claim is proved. Hence, by the Gärtner-Ellis theorem, we deduce that $ W_n/a(n) $ satisfies the moderate deviation principle with rate function $ I(x) = \frac{x^2}{2\theta} $. It follows from (1.8) and (2.17) that $ \forall \epsilon>0 $,

$ \begin{align*} &\mathop {\lim }\limits_{n \to \infty } \frac{n}{a^2(n)}\log \mathbb{P}\left(\left|\frac{S_n-E[S_n]}{a(n)}-\frac{W_n}{a(n)}\right|>\epsilon\right)\\ = &\mathop {\lim }\limits_{n \to \infty } \frac{n}{a^2(n)}\log \mathbb{P}\left(\left|\frac{\sum\limits_{k = 1}^{n}[E[f(X_k)|X_{k-1}]-E[f(X_k)]]}{a(n)}\right|>\epsilon\right)\\ = &0. \end{align*} $

Thus, by the exponential equivalence method (see Theorem 4.2.13 of Dembo and Zeitouni [10], Gao [11]), we see that $ \{\frac{S_n-E[S_n]}{a(n)}\} $ satisfies the same moderate deviation principle as $ \{\frac{W_n}{a(n)}\} $ with rate function $ I(x) = \frac{x^2}{2\theta} $. This completes the proof.

References
[1]
Huang H L, Yang W G, Shi Z Y. The central limit theorem for nonhomogeneous Markov chains[J]. Chinese J. Appl. Prob. Stat., 2013, 29(4): 337-347.
[2]
Gao F Q. Moderately large deviations for uniformly ergodic Markov processes, research announcements[J]. Adv. Math. (China), 1992, 21(3): 364-365.
[3]
de Acosta A. Moderate deviations for empirical measures of Markov chains: lower bounds[J]. Ann. Prob., 1997, 25(1): 259-284.
[4]
de Acosta A, Chen X. Moderate deviations for empirical measures of Markov chains: upper bounds[J]. J. Theor. Prob., 1998, 11(4): 1075-1110.
[5]
Yang W G. Convergence in the Cesàro sense and strong law of large numbers for nonhomogeneous Markov chains[J]. Linear Alg. Appl., 2002, 354(1): 275-286.
[6]
Yang W G. Strong law of large numbers for nonhomogeneous Markov chains[J]. Linear Alg. Appl., 2009, 430(11-12): 3008-3018. DOI: 10.1016/j.laa.2009.01.016
[7]
Hu D H. Countable Markov process theory (in Chinese)[M]. Wuhan: Wuhan University Press, 1983.
[8]
Bowerman B, David H T, Isaacson D. The convergence of Cesàro averages for certain nonstationary Markov chains[J]. Stoch. Proc. Appl., 1977, 5(1): 221-230.
[9]
Brown B M. Martingale central limit theorems[J]. Ann. Math. Statist., 1971, 42(1): 59-66.
[10]
Dembo A, Zeitouni O. Large deviations techniques and applications[M]. New York: Springer, 1998.
[11]
Gao F Q. Moderate deviations for a nonparametric estimator for sample coverage[J]. Ann. Prob., 2013, 41(2): 641-669.
[12]
Zhang H Z, Hao R L, Ye Z X, Yang W G. Some strong limit properties for countable nonhomogeneous Markov chains (in Chinese)[J]. Chinese J. Appl. Prob. Stat., 2016, 32(1): 62-68.