数学杂志 (Journal of Mathematics) 2015, Vol. 35, Issue (2): 237-251
THE STATISTICS ANALYSIS OF RANDOMLY TRIMMED MEANS AND THEIR BOOTSTRAP
LUO Kui1, MA Xue-min2, MA Zhi-wei3, ZHOU Xuan1    
1. Industrial Training Centre, Shenzhen Polytechnic, Shenzhen 518055, China;
2. School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China;
3. Center for Chinese Entrepreneur Studies, Tsinghua University, Beijing 100084, China
Abstract: In this paper, we consider studentized randomly trimmed means and their bootstrap under relaxed conditions. Using the empirical process approach of Chen and Giné (CG), we obtain a studentized central limit theorem for randomly trimmed means whose asymptotic properties do not depend on the underlying density. We further show that the same property holds for the bootstrap of studentized trimmed means. These results extend the results on randomly trimmed means of Chen and Giné [1].
Key words: trimmed means     central limit theorems     bootstrap    
1 Introduction

It is well known that the sample mean is very sensitive to outliers in the data. As an alternative, randomly trimmed means have been studied widely in statistics. A randomly trimmed mean can be constructed as follows: choose two levels $a_{n}<b_{n}$ depending on the sample and calculate the average of the data points between $a_{n}$ and $b_{n}$. These levels can be obtained by evaluating functions $a(F)$ and $b(F)$ at the empirical distribution $F_{n}$, where $F$ is the probability distribution of $X$. Taking $a(F)=F^{-1}(\alpha/2)$ and $b(F)=F^{-1}(1-\alpha/2)$ gives the classical trimming, in which $X$ is trimmed at level $\alpha$. Taking $a(F)=\mu(F)-cs(F)$ and $b(F)=\mu(F)+cs(F)$, where $\mu$ denotes the median, $s$ the median absolute deviation (MAD), and $c$ a constant of choice, gives the trimmed mean of Hampel [2].
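As a small numerical illustration of these two constructions (a sketch only, not part of the formal development; the simulated sample, the trimming level $\alpha$ and the constant $c$ are arbitrary choices), both trimmed means ignore a gross outlier that dominates the sample mean:

```python
import numpy as np

def trimmed_mean(x, a, b):
    """Average of the observations falling in the interval [a, b]."""
    inside = x[(x >= a) & (x <= b)]
    return inside.mean()

rng = np.random.default_rng(0)
x = np.append(rng.normal(size=99), 1e6)   # standard normal data plus one gross outlier

# Classical trimming at level alpha: a = F_n^{-1}(alpha/2), b = F_n^{-1}(1 - alpha/2).
alpha = 0.10
a_cl, b_cl = np.quantile(x, [alpha / 2, 1 - alpha / 2])

# Hampel-type trimming: a = median - c*MAD, b = median + c*MAD, c chosen by the user.
c = 3.0
med = np.median(x)
mad = np.median(np.abs(x - med))
a_h, b_h = med - c * mad, med + c * mad

# Both trimmed means stay near 0 while the sample mean is ruined by the outlier.
print(trimmed_mean(x, a_cl, b_cl), trimmed_mean(x, a_h, b_h), x.mean())
```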

Trimmed means are robust and can be bootstrapped. Hall and Padmanabhan [3] studied the bootstrap for the studentized classical trimmed mean. Shorack [4] provided a systematic study of bootstrapping various L-statistics. In a recent paper, Chen and Giné [1] (henceforth CG) presented a unified, empirical process based approach to the central limit theorem and the bootstrap central limit theorem for the general trimmed mean under mild assumptions on the levels $a(F)$ and $b(F)$. Their basic assumption is that $\sqrt{n}(a_{n}-a)$ can be asymptotically linearized, i.e., that it is asymptotically equivalent to $\sum\limits_{i=1}^{n}(h_{1}(X_{i})-Eh_{1}(X))/\sqrt{n}$ for some function $h_{1}$ that is square integrable for $F$. A similar condition holds for $\sqrt{n}(b_{n}-b)$.

Despite their generality, the central limit theorems of CG involve a complicated expression for the asymptotic variance of trimmed means that depends on the underlying density of $X$. Following the empirical process approach of CG, we study studentized randomly trimmed means and their bootstrap under relaxed conditions, which allows us to avoid this complicated variance expression in our central limit theorems. The results of this paper may be used to obtain asymptotic confidence intervals and tests for trimmed means, which can in turn be compared with the corresponding confidence intervals and tests for the usual mean; we do not pursue that here.

This paper is organized as follows. In Section 2, we introduce some definitions and the main results of CG. Section 3 studies the asymptotic properties of studentized trimmed means. Section 4 verifies the validity of the bootstrap. A simple example is presented in Section 5.

2 Definitions and Existing Results

Before describing our findings, we introduce some basic definitions and review some existing results in this section.

Let $X, X_{1}, \cdots, X_{n}, \cdots$ be independent identically distributed real random variables with common probability law $P$ and, for each $n\in N$, let $P_{n}=\frac{1}{n}\sum\limits_{i=1}^{n}\delta_{X_{i}}$ be the empirical measure corresponding to the first $n$ observations $X_{1}, \cdots, X_{n}$. $F$ and $F_{n}$ denote the cumulative distribution functions associated to $P$ and $P_{n}(\omega)$ for all $n\in N$ and $\omega\in\Omega$. In addition, we assume $a(P)\leq b(P)$ and $F(b(P))-F(a(P)-)\neq 0$. The trimmed mean of $P$ based on $a$ and $b$ is defined as

$\theta=\theta(P):=\frac{\int_{a(P)}^{b(P)}xdF(x)}{F(b(P))-F(a(P)-)}=E(X|X\in[a(P), b(P)]).$ (2.1)

For convenience, in what follows we write $a$ and $b$ for $a(P)$ and $b(P)$, and $a_{n}$ and $b_{n}$ for $a(P_{n})$ and $b(P_{n})$. In analogy with (2.1), we can also define the empirical trimmed mean based on $a_{n}$ and $b_{n}$ as follows:

$\theta_{n}=\theta(P_{n})=\frac{\sum\limits_{i=1}^{n}X_{i}I_{[a_{n}, b_{n}]}(X_{i})} {\sum\limits_{i=1}^{n}I_{[a_{n}, b_{n}]}(X_{i})}=\frac{\int_{a_{n}}^{b_{n}}xdF_{n}(x)}{F_{n}(b_{n})-F_{n}(a_{n}-)}.$ (2.2)
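As a quick numerical check of (2.2) (an illustrative sketch; the exponential sample and the quantile levels are arbitrary choices), the two expressions for $\theta_{n}$ agree, since integrals against $F_{n}$ are sample averages and the $1/n$ factors cancel:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(size=200)
a_n, b_n = np.quantile(x, [0.05, 0.95])   # sample-dependent trimming levels

inside = (x >= a_n) & (x <= b_n)

# Left-hand expression of (2.2): ratio of sums over the kept observations.
theta_sum = (x * inside).sum() / inside.sum()

# Right-hand expression of (2.2): ratio of integrals against F_n, i.e. of averages.
theta_int = (x * inside).mean() / inside.mean()

assert np.isclose(theta_sum, theta_int)
print(theta_sum)
```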

Given $X, X_{i}, i\in N$ with c.d.f. $F$ and density $f$. Let $P_{n}=\frac{1}{n}\sum\limits_{i=1}^{n}\delta_{X_{i}}$ be the empirical measure and $\nu_{n}=\sqrt{n}(P_{n}-P)$ the empirical process. Let $-\infty<a<b<\infty$ and let $a_{n}, b_{n}$ be random variables such that $-\infty<a_{n}\leq b_{n}<\infty$ ${\rm a.s.}$. Following CG, we assume the following conditions:

(D.1) The c.d.f. $F$ has a derivative $f$ on an open set containing $a$ and $b$ and $f$ is continuous there, hence, $f$ is uniformly continuous on a compact set $K$ whose interior contains $a$ and $b$.

(D.2) $f(a)+f(b)\neq 0$.

(L) There exist measurable, $P$-square integrable functions $h_{1}$ and $h_{2}$ such that

$\sqrt{n}(a_{n}-a)=\nu_{n}(h_{1})+o_{p}(1)\;\;{\rm and}\;\;\sqrt{n}(b_{n}-b)=\nu_{n}(h_{2})+o_{p}(1).$ (2.3)

Based on the above framework, CG proves the following lemma and theorem:

Lemma 1 Assume (D.1) and (L). Define, for $x\in R$,

$g_{1}(x)= I_{[a, b]}(x)+f(b)h_{2}(x)-f(a)h_{1}(x),$ (2.4)
$g_{2}(x) = xI_{[a, b]}(x)+b f(b)h_{2}(x)-a f(a)h_{1}(x).$ (2.5)

Then

$\sqrt{n}[\int_{a_{n}}^{b_{n}}dF_{n}(x)-\int_{a}^{b}dF(x)] = \nu_{n}(g_{1})+o_{p}(1),$ (2.6)
$\sqrt{n}[\int_{a_{n}}^{b_{n}}xdF_{n}-\int_{a}^{b}xdF(x)] = \nu_{n}(g_{2})+o_{p}(1).$ (2.7)

Theorem 1 Assume (D.1), (D.2) and (L), and set

$g(x):=\frac{1}{\int_{a}^{b}dF(t)}g_{2}(x)-\frac{\int_{a}^{b}tdF(t)}{(\int_{a}^{b}dF(t))^{2}}g_{1}(x), \;\;x\in R$

with $g_{1}$ and $g_{2}$ as defined in Lemma 1. Let $\theta_{n}$ be the trimmed mean based on $a_{n}$ and $b_{n}$ and let $\theta$ be its population counterpart, as defined above. Then

$\sqrt{n}(\theta_{n}-\theta)\rightarrow\sqrt{{\rm Var}_{F}(g)}Z,$

in distribution, where $Z$ is standard normal.

Theorem 1 is a very general result. It covers many cases studied in the previous literature, such as Hampel's means (Hampel [2, 5]), symmetrically trimmed means (Huber [6]) and Kim's metrically trimmed means (Kim [7]). However, the variance of randomly trimmed means, ${\rm Var}_{F}(g)$, in Theorem 1 has a very complicated form. In particular, it depends on the underlying density through $f(a)$ and $f(b)$. With these constraints, Theorem 1 may be difficult to implement in practice, for example, to construct tests for sample means. To overcome this deficiency, we prove a studentized central limit theorem for randomly trimmed means whose asymptotic properties do not depend on the underlying density under relaxed conditions. Furthermore, the bootstrap is also valid in our case.

3 Asymptotics of Studentized Trimmed Means

In a spirit similar to Lemma 1, we prove the following lemma. It plays an important role in our proof of the studentized central limit theorem below.

Lemma 2 Assume (D.1) and (L) hold. Define, for $x\in R$,

$g_{3}(x)=x^{2}I_{[a, b]}(x)+b^{2}f(b)h_{2}(x)-a^{2}f(a)h_{1}(x).$

Then

$\sqrt{n}(\int_{a_{n}}^{b_{n}}x^{2}dF_{n}-\int_{a}^{b}x^{2}dF)=\nu_{n}(g_{3})+o_{p}(1).$ (3.1)

Proof We can write the left side of (3.1) as follows:

$\sqrt{n}[\int_{a_{n}}^{b_{n}}x^{2}dF_{n}-\int_{a}^{b}x^{2}dF]=\sqrt{n}\int_{a_{n}}^{b_{n}}x^{2}d(F_{n}-F)+\sqrt{n}\int_{a_{n}}^{a} x^{2}dF+\sqrt{n}\int_{b}^{b_{n}} x^{2}dF.$

Since $K$ is compact, $x^{2}$ is bounded on $K$. Then, for $[c, d]\subset K$, we have that

$\|x^{2}I_{[c, d]}(x)\|_{\infty}\leq \|x^{2}\|_{K}:=\sup\limits_{x\in K}x^{2}$

is finite. Also, there is $\delta_{0}>0$ such that $[a-\delta_{0}, a+\delta_{0}]\cup[b-\delta_{0}, b+\delta_{0}]\in K$. Therefore, by the asymptotic equicontinuity of the empirical process, see Dudley [8], we have

$\lim\limits_{\delta\rightarrow0}\limsup\limits_{n} {\rm Pr}\{\sup\limits_{|c-a|\leq\delta, |d-b|\leq\delta}|\nu_{n}(x^{2}I_{[c, d]}(x))-\nu_{n}(x^{2}I_{[a, b]}(x))|\geq\varepsilon\}=0$

for all $\varepsilon>0$. Since

$\quad\Pr\{|\sqrt{n}\int_{a_{n}}^{b_{n}}x^{2}d(F_{n}-F)-\nu_{n}(x^{2}I_{[a, b]}(x))|\geq\varepsilon\}\\ \leq \Pr\{\sup\limits_{|c-a|\leq\delta, |d-b|\leq\delta}|\nu_{n}(x^{2}I_{[c, d]}(x))-\nu_{n}(x^{2}I_{[a, b]}(x))|\geq\varepsilon\} \\ \quad + \Pr\{|b_{n}-b|>\delta\}+\Pr\{|a_{n}-a|>\delta\}, $

the equicontinuity condition and condition (L) give, upon taking limits first as $n\rightarrow\infty$ and then as $\delta\rightarrow 0$, \begin{equation*} \sqrt{n}\int _{a_{n}}^{b_{n}}x^{2}d(F_{n}-F)=\nu_{n}(x^{2}I_{[a, b]}(x))+o_{p}(1). \end{equation*} Given $ 0<\delta\leq\delta_{0}$ and $M_{n}, n\in N$, such that $M_{n}\rightarrow \infty$ and $M_{n}/\sqrt{n}\rightarrow 0$, set

$T_{n}:=\sup\{|f(b)-f(c)|:|b-c|\leq (\delta+M_{n})/\sqrt{n}\}, $

which tends to zero by uniform continuity of $f$ on $K$. Then, on the event where $|\nu_{n}(h_{2})-\sqrt{n}(b_{n}-b)|\leq \delta$ and $|\nu_{n}(h_{2})|\leq M_{n}$, we have

$\quad\mid\sqrt{n}\int_{b}^{b_{n}}x^{2}dF-b^{2}f(b)\nu_{n}(h_{2})\mid \\ = \mid\sqrt{n}\int_{b}^{b_{n}}x^{2}(f(x)-f(b)+f(b))dx-b^{2}f(b)\nu_{n}(h_{2})\mid \\ \leq \sqrt{n}\mid\int_{b}^{b_{n}}x^{2}|f(x)-f(b)|dx\mid+\mid \sqrt{n} \int_{b}^{b_{n}}x^{2}dx-b^{2}\nu_{n}(h_{2})\mid f(b) \\ \leq \sqrt{n}|b_{n}-b|(|b|+\frac{\delta+M_{n}}{\sqrt{n}})^{2}T_{n}+(\sqrt{n}\mid\frac{b_{n}^{3}-b^{3}}{3}-b^{2}(b_{n}-b)\mid \\ \quad +b^{2}\mid\sqrt{n}(b_{n}-b)-\nu_{n}(h_{2})\mid)f(b) \\ = \sqrt{n}|b_{n}-b|(|b|+\frac{\delta+M_{n}}{\sqrt{n}})^{2}T_{n}+(\frac{\sqrt{n}(b_{n}-b)^{2}\mid b_{n}+2b\mid}{3} \\ \quad +b^{2}\mid\sqrt{n}(b_{n}-b)-\nu_{n}(h_{2})\mid)f(b) \\ = \sqrt{n}|b_{n}-b|b^{2}T_{n}+ \mid\sqrt{n}(b_{n}-b)-\nu_{n}(h_{2})\mid b^{2} f(b)+o_{p}(1). $

The last equality holds because $\frac{b_{n}^{3}-b^{3}}{3}-b^{2}(b_{n}-b)=\frac{(b_{n}-b)^{2}(b_{n}+2b)}{3}$, $\sqrt{n}(b_{n}-b)=O_{p}(1)$ by condition (L), $b_{n}-b=o_{p}(1)$ and $\frac{\delta+M_{n}}{\sqrt{n}}\rightarrow 0$. Based on the above results, we have the following, for any $\varepsilon>0$:

$ \quad \Pr\{\mid\sqrt{n}\int_{b}^{b_{n}}x^{2}dF-b^{2}f(b)\nu_{n}(h_{2})\mid>\varepsilon\}\\ \leq \Pr\{\sqrt{n}|b_{n}-b|b^{2}T_{n}+\mid\sqrt{n}(b_{n}-b)-\nu_{n}(h_{2})\mid b^{2} f(b)>\varepsilon\}\\ \quad + \Pr\{|\nu_{n}(h_{2})-\sqrt{n}(b_{n}-b)|>\delta\}+\Pr\{|\nu_{n}(h_{2})|>M_{n}\}\rightarrow 0. $

Then we have

$\sqrt{n}\int_{b}^{b_{n}}x^{2}dF=b^{2}f(b)\nu_{n}(h_{2})+o_{p}(1).$

Similarly, we can prove

$\sqrt{n}\int_{a_{n}}^{a}x^{2}dF=-a^{2}f(a)\nu_{n}(h_{1})+o_{p}(1).$

Then the lemma is proven.

Based on the above lemma, we can prove the following result:

Lemma 3 Assume conditions (D.1), (D.2) and (L) hold. Denote

$S_{n}^{2}=\frac{\int_{a_{n}}^{b_{n}}(x-\theta_{n})^{2}dF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}} =\frac{\int_{a_{n}}^{b_{n}}x^{2}dF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}-(\theta_{n})^{2},$

where $\theta_{n}=\frac{\int_{a_{n}}^{b_{n}}xdF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}$. Then we have $S_{n}^{2}=S^{2}+o_{p}(1)$, where

$S^{2}=\frac{\int_{a}^{b}(t-\theta)^{2}dF}{\int_{a}^{b}dF}=\frac{\int_{a}^{b}t^{2}dF}{\int_{a}^{b}dF}-\frac{(\int_{a}^{b}tdF)^{2}}{(\int_{a}^{b}dF)^{2}}.$

Proof

$\begin{aligned} S_{n}^{2}& =\frac{\int_{a_{n}}^{b_{n}}x^{2}dF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}-\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}\\&= \frac{\int_{a}^{b}dF}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}}^{b_{n}}x^{2}dF_{n}}{\int_{a}^{b}dF}- \frac{\int_{a}^{b}dF}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}\frac{\int_{a}^{b}dF}{\int_{a_{n}}^{b_{n}}dF_{n}}. \end{aligned}$

Since $\sqrt{n}[\int_{a_{n}}^{b_{n}}dF_{n}-\int_{a}^{b}dF]=\nu_{n}(g_{1})+o_{p}(1)$,

$\mid \frac{\int_{a_{n}}^{b_{n}}dF_{n}}{\int_{a}^{b}dF}-1\mid=\frac{1}{\int_{a}^{b}dF}\mid \int_{a_{n}}^{b_{n}}dF_{n}-\int_{a}^{b}dF\mid\rightarrow 0\;\;\text{in}\;\;\Pr.$

We can write $\frac{\int_{a}^{b}dF}{\int_{a_{n}}^{b_{n}}dF_{n}}=1+\tau_{n}$ where $\tau_{n}\rightarrow 0$ in probability. Then we have

$\begin{aligned} S_{n}^{2}&= \frac{\int_{a_{n}}^{b_{n}}x^{2}dF_{n}}{\int_{a}^{b}dF}(1+\tau_{n})-\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}(1+\tau_{n})^{2} \\ &= (\frac{\int_{a_{n}}^{b_{n}}x^{2}dF_{n}}{\int_{a}^{b}dF}-\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF})+\frac{\int_{a_{n}}^{b_{n}}x^{2}dF_{n}}{\int_{a}^{b}dF}\tau_{n}\\& \quad -\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}2\tau_{n}+\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}\tau_{n}^{2}\\&=\frac{\int_{a_{n}}^{b_{n}}x^{2}dF_{n}}{\int_{a}^{b}dF}-\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}+o_{p}(1). \end{aligned}$

Based on Lemma 1 and Lemma 2, we have

$\begin{aligned} S_{n}^{2}&= \frac{\int_{a_{n}}^{b_{n}}x^{2}dF_{n}}{\int_{a}^{b}dF}-\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}\frac{\int_{a_{n}}^{b_{n}}x dF_{n}}{\int_{a}^{b}dF}+o_{p}(1) \\&= \frac{1}{\sqrt{n}}\frac{\sqrt{n}\int_{a}^{b}t^{2}dF+\nu_{n}(g_{3})}{\int_{a}^{b}dF}-\frac{1}{n}\frac{\sqrt{n}\int_{a}^{b}tdF+\nu_{n}(g_{2})}{\int_{a}^{b}dF}\frac{\sqrt{n}\int_{a}^{b}tdF+\nu_{n}(g_{2})}{\int_{a}^{b}dF}+o_{p}(1) \\ &= \frac{\int_{a}^{b}t^{2}dF}{\int_{a}^{b}dF}+\frac{1}{\sqrt{n}}\frac{\nu_{n}(g_{3})}{\int_{a}^{b}dF}-\frac{1}{n}\left[\frac{n(\int_{a}^{b}tdF)^{2}+2\sqrt{n}\int_{a}^{b}tdF\cdot\nu_{n}(g_{2})+\nu_{n}^{2}(g_{2})}{(\int_{a}^{b}dF)^{2}}\right]+o_{p}(1) \\ &= \frac{\int_{a}^{b}t^{2}dF}{\int_{a}^{b}dF}-\frac{(\int_{a}^{b}tdF)^{2}}{(\int_{a}^{b}dF)^{2}}+\frac{1}{\sqrt{n}}\frac{\nu_{n}(g_{3})}{\int_{a}^{b}dF}-\frac{2}{\sqrt{n}}\frac{\int_{a}^{b}tdF}{(\int_{a}^{b}dF)^{2}}\nu_{n}(g_{2})-\frac{1}{n}\frac{\nu_{n}^{2}(g_{2})}{(\int_{a}^{b}dF)^{2}}+o_{p}(1) \\ &= \frac{\int_{a}^{b}t^{2}dF}{\int_{a}^{b}dF}-\frac{(\int_{a}^{b}tdF)^{2}}{(\int_{a}^{b}dF)^{2}}+o_{p}(1)=S^2+o_{p}(1). \end{aligned}$

The lemma is proven.
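Lemma 3 can be checked numerically. The sketch below uses assumed settings (standard normal data and fixed, non-random levels $a_{n}=a$, $b_{n}=b$, for which condition (L) holds trivially with $h_{1}=h_{2}=0$) and compares $S_{n}^{2}$ with $S^{2}$ computed by a simple Riemann sum:

```python
import numpy as np

a, b = -1.5, 1.5

# Population trimmed variance S^2 for the standard normal (theta = 0 by symmetry),
# computed by a crude Riemann sum on a fine grid.
t = np.linspace(a, b, 200001)
dt = t[1] - t[0]
phi = np.exp(-t * t / 2) / np.sqrt(2 * np.pi)
mass = (phi * dt).sum()                      # integral of dF over [a, b]
S2 = (t * t * phi * dt).sum() / mass

# Empirical counterpart S_n^2 with fixed levels a_n = a, b_n = b.
rng = np.random.default_rng(2)
x = rng.normal(size=200000)
inside = (x >= a) & (x <= b)
theta_n = (x * inside).mean() / inside.mean()
Sn2 = ((x - theta_n) ** 2 * inside).mean() / inside.mean()

print(S2, Sn2)   # the two values should be close for large n
```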

Based on the above lemmas, we can obtain the following theorem.

Theorem 2 Assume that conditions (D.1), (D.2) and the following condition (LS) hold:

(LS) $\sqrt{n}(a_{n}-a)=o_{p}(1)\;\;\text{and}\;\;\sqrt{n}(b_{n}-b)=o_{p}(1)$.

Define $\sigma_{n}^{2}=\frac{S_{n}^{2}}{\int_{a_{n}}^{b_{n}}dF_{n}}=\frac{\int_{a_{n}}^{b_{n}}(x-\theta_{n})^{2}dF_{n}}{(\int_{a_{n}}^{b_{n}}dF_{n})^{2}}$. Then the studentized trimmed mean satisfies $\frac{\sqrt{n}(\theta_{n}-\theta)}{\sigma_{n}}\rightarrow N(0, 1)$ in distribution, where $\theta=\frac{\int_{a}^{b}tdF}{\int_{a}^{b}dF}$ is the population randomly trimmed mean.

Proof Under condition (LS), condition (L) holds with $h_{1}=h_{2}=0$. We have

$g_{1}(x)= I_{[a, b]}(x)+f(b)h_{2}(x)-f(a)h_{1}(x)=I_{[a, b]}(x)$

and

$g_{2}(x) = xI_{[a, b]}(x)+b f(b)h_{2}(x)-a f(a)h_{1}(x)=xI_{[a, b]}(x).$

Then

$g(x):=\frac{1}{\int_{a}^{b}dF(t)}g_{2}(x)-\frac{\int_{a}^{b}tdF(t)}{(\int_{a}^{b}dF(t))^{2}}g_{1}(x)=\frac{xI_{[a, b]}(x)}{\int_{a}^{b}dF(t)}-\frac{\int_{a}^{b}tdF(t)}{(\int_{a}^{b}dF(t))^{2}}I_{[a, b]}(x).$ (3.2)

Since $E_{F}(g)=\frac{\int_{a}^{b}tdF(t)}{\int_{a}^{b}dF(t)}-\frac{\int_{a}^{b}tdF(t)}{\int_{a}^{b}dF(t)}=0$, we have

$\begin{aligned} {\rm Var}_{F}(g)&= E_{F}(g^{2})-(E_{F}(g))^{2}\\ &= \frac{\int_{a}^{b}t^{2}dF(t)}{(\int_{a}^{b}dF(t))^{2}}-2\frac{\int_{a}^{b}tdF(t)}{(\int_{a}^{b}dF(t))^{3}}\int_{a}^{b}tdF(t)+\frac{(\int_{a}^{b}tdF(t))^{2}}{(\int_{a}^{b}dF(t))^{4}}\int_{a}^{b}dF(t)\\ &= \frac{\int_{a}^{b}t^{2}dF(t)}{(\int_{a}^{b}dF(t))^{2}}-\frac{(\int_{a}^{b}tdF(t))^{2}}{(\int_{a}^{b}dF(t))^{3}} \\ &=\frac{S^{2}}{\int_{a}^{b}dF(t)}. \end{aligned}$

Since $S_{n}^{2}=S^{2}+o_{p}(1)$,

$ {\rm Var}_{F}(g)=\frac{S_{n}^{2}}{\int_{a}^{b}dF(t)}+o_{p}(1)=\frac{S_{n}^{2}}{\int_{a_{n}}^{b_{n}}dF_{n}(t)}\frac{\int_{a_{n}}^{b_{n}}dF_{n}(t)}{\int_{a}^{b}dF(t)}+o_{p}(1)\\ = \sigma_{n}^{2}(1+o_{p}(1))+o_{p}(1) = \sigma_{n}^{2}+o_{p}(1). $

According to Theorem 1, we have $\frac{\sqrt{n}(\theta_{n}-\theta)}{\sigma_{n}}\rightarrow N(0, 1)$ in distribution.
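A Monte Carlo sketch of Theorem 2 (with assumed settings: standard normal data and fixed levels, so that condition (LS) holds trivially; the sample size, the number of replications and the levels are arbitrary choices) shows that the studentized statistic is approximately standard normal without any knowledge of the density $f$:

```python
import numpy as np

a, b = -1.0, 2.0
rng = np.random.default_rng(3)

# Population trimmed mean theta = E(X | X in [a, b]) for N(0, 1), by Riemann sum.
t = np.linspace(a, b, 300001)
dt = t[1] - t[0]
phi = np.exp(-t * t / 2) / np.sqrt(2 * np.pi)
mass = (phi * dt).sum()
theta = (t * phi * dt).sum() / mass

n, reps = 400, 2000
stats = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    inside = (x >= a) & (x <= b)
    p_n = inside.mean()                      # = integral of dF_n over [a_n, b_n]
    theta_n = (x * inside).mean() / p_n
    Sn2 = ((x - theta_n) ** 2 * inside).mean() / p_n
    sigma_n = np.sqrt(Sn2 / p_n)             # studentizing scale of Theorem 2
    stats[r] = np.sqrt(n) * (theta_n - theta) / sigma_n

print(stats.mean(), stats.std())             # should be near 0 and 1
```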

4 Bootstrap of Studentized Trimmed Means

In the previous section, we studied the asymptotic properties of studentized trimmed means. With a small sample, however, estimators and tests based on those asymptotic properties may not be accurate, and the bootstrap may provide an improvement. In the case of the mean, Hall [9] proved that if $E|X|^{3}<\infty$ then the bootstrap approximation is better than the normal one. Although we do not know whether this is the case for the trimmed mean, it is interesting to know that the studentized CLT for the trimmed mean can be bootstrapped.

Given the sample $X_{1}, \cdots, X_{n}$, we draw $n$ observations from it with replacement, and denote by $X_{n, 1}^{b}, \cdots, X_{n, n}^{b}$ the resulting bootstrap sample. Following the notation of CG, we write $F_{n}^{b}, P_{n}^{b}$ and $\nu_{n}^{b}$ respectively for the empirical c.d.f., the empirical measure and the empirical process based on this sample: $P_{n}^{b}(A)=\frac{1}{n}\sum\limits_{i=1}^{n}\delta_{X_{n, i}^{b}}(A)$, $F_{n}^{b}(x)=P_{n}^{b}(-\infty, x]$ and $\nu_{n}^{b}=\sqrt{n}(P_{n}^{b}-P_{n})$. In addition, we denote by $\Pr_{b}=\Pr_{b}(\omega)$ the conditional probability given the sample $X_{1}, \cdots, X_{n}$ (its dependence on $\omega$ will not be displayed, for convenience), and $L^{b}$ denotes the conditional law given the sample. The symbol $o_{P_{b}}(1)$ means the following: $V_{n}(X_{n, 1}^{b}, \cdots, X_{n, n}^{b}, X_{1}, \cdots, X_{n})$ is $o_{P_{b}}(1)$ a.s. if $\Pr_{b}\{|V_{n}|>\varepsilon\}\rightarrow 0$ a.s. for every $\varepsilon>0$. This is equivalent to: for almost every $\omega$, every subsequence of $V_{n}(\omega, \omega')$ has a further subsequence that converges to zero $\omega'$-${\rm a.s.}$.
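The resampling scheme and the bootstrap empirical process $\nu_{n}^{b}$ can be sketched as follows (an illustration only; the simulated sample, the sample size and the test function $h$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=500)                    # the observed sample X_1, ..., X_n
n = x.size

# One bootstrap sample: n draws with replacement from the data.
xb = rng.choice(x, size=n, replace=True)

# The bootstrap empirical process evaluated at a function h:
# nu_n^b(h) = sqrt(n) * (P_n^b h - P_n h).
def nu_b(h, xb, x):
    return np.sqrt(x.size) * (h(xb).mean() - h(x).mean())

h = lambda t: (t <= 0.5).astype(float)      # h = indicator of (-inf, 0.5]
print(nu_b(h, xb, x))
```

Conditionally on the sample, $\nu_{n}^{b}(h)$ is centered, which is what the assertion on resampled replicates below reflects.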

We will also need the following condition:

(D.3) $P$ has a density $f$ on $R$, the set $B_{f}=\{f>0\}$ is open and $f$ is continuous on $B_{f}$.

CG proved a central limit theorem that can be considered the bootstrap version of Theorem 1. Under similar conditions, the bootstrap is also valid for the studentized trimmed mean CLT; this is Theorem 3 below. To show this result, we first need to prove the following lemma.

Lemma 4 Assume (D.2), (D.3), (L), $a, b\in B_{f}$, $a_{n}\rightarrow a$ ${\rm a.s.}$ and $b_{n}\rightarrow b$ ${\rm a.s.}$. Assume also that $a_{n}^{b}$ and $b_{n}^{b}$, defined respectively as $a_{n}^{b}=a(F_{n}^{b})$ and $b_{n}^{b}=b(F_{n}^{b})$, satisfy

$\sqrt{n}(a_{n}^{b}-a_{n})=\nu_{n}^{b}(h_{1})+o_{P_{b}}(1)\;\;\text{and}\;\;\sqrt{n}(b_{n}^{b}-b_{n})=\nu_{n}^{b}(h_{2})+o_{P_{b}}(1)\;\; {\rm a.s.}.$

Define, for $x\in R$,

$ g_{1}(x)= I_{[a, b]}(x)+f(b)h_{2}(x)-f(a)h_{1}(x), \\ g_{2}(x) = xI_{[a, b]}(x)+b f(b)h_{2}(x)-a f(a)h_{1}(x), \\ g_{3}(x) = x^{2}I_{[a, b]}(x)+b^{2} f(b)h_{2}(x)-a^{2} f(a)h_{1}(x). $

We have

$\sqrt{n}(\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}-\int_{a_{n}}^{b_{n}}dF_{n}) = \nu_{n}^{b}(g_{1}) +o_{p^{b}}(1)\;\;{\rm a.s.},$ (4.1)
$\sqrt{n}(\int_{a_{n}^{b}}^{b_{n}^{b}}xdF_{n}^{b}-\int_{a_{n}}^{b_{n}}xdF_{n}) = \nu_{n}^{b}(g_{2}) +o_{p^{b}}(1)\;\;{\rm a.s.},$ (4.2)
$\sqrt{n}(\int_{a_{n}^{b}}^{b_{n}^{b}}x^{2}dF_{n}^{b}-\int_{a_{n}}^{b_{n}}x^{2}dF_{n}) = \nu_{n}^{b}(g_{3}) +o_{p^{b}}(1)\;\; {\rm {\rm a.s.}}.$ (4.3)

Proof We provide the complete proof of (4.1) in the following; (4.2) and (4.3) can be proven in the same spirit. We decompose

$\sqrt{n}(\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}-\int_{a_{n}}^{b_{n}}dF_{n})=\sqrt{n}(\int_{a_{n}}^{b_{n}}d(F_{n}^{b}-F_{n})+\int_{a_{n}^{b}}^{a_{n}}dF_{n}^{b} +\int_{b_{n}}^{b_{n}^{b}}dF_{n}^{b}).$ (4.4)

According to Giné [10], the class of functions

$\mathcal{F}=\{I_{[a, b]}(x), xI_{[a, b]}(x), x^{2}I_{[a, b]}(x): -\infty<c_{1}\leq a<b\leq c_{2}<\infty\}$

is uniform P-Donsker. We have

$\{\nu_{n}^{b}(f): f\in\mathcal{F}\}\rightarrow _{L^{b}}\{G_{p}(f): f\in\mathcal{F}\} \; {\rm a.s., }$

where $\rightarrow _{L^{b}}$ denotes convergence in law conditionally on the sample and $G_{p}$ is a centered Gaussian process. Then $\nu_{n}^{b}(I_{[a, b]}(x))\rightarrow _{L^{b}}G_{p}(I_{[a, b]}(x))$.

We can write

$\sqrt{n}\int_{a_{n}}^{b_{n}}d(F_{n}^{b}-F_{n})=\nu_{n}^{b}(I_{[a_{n}, b_{n}]}(x))=\nu_{n}^{b}(I_{[a_{n}, b_{n}]}(x))-\nu_{n}^{b}(I_{[a, b]}(x))+\nu_{n}^{b}(I_{[a, b]}(x)).$

To prove $\sqrt{n}\int_{a_{n}}^{b_{n}}d(F_{n}^{b}-F_{n})\rightarrow_{L^{b}}G_{p}(I_{[a, b]}(x))\;\; {\rm a.s.}$, we only need to show

$\nu_{n}^{b}(I_{[a_{n}, b_{n}]}(x))-\nu_{n}^{b}(I_{[a, b]}(x))\rightarrow 0\;\;\;\;\text{in}\;\;p^{b}\;\; {\rm a.s.}.$ (4.5)

According to Giné [10],

$\lim\limits_{\delta\rightarrow 0}\limsup\limits_{n\rightarrow\infty} \Pr\limits_{b}\{\sup\limits_{g, h\in\mathcal{F}, E_{p}(g-h)^{2}\leq\delta}\mid \nu_{n}^{b}(g)-\nu_{n}^{b}(h)\mid>\varepsilon\}=0\;\;\forall \varepsilon>0\;\; {\rm a.s.},$ (4.6)

where the class of functions $\mathcal{F}$ is uniform $P$-Donsker. In addition,

$\lim\limits_{n\rightarrow\infty}E_{p}(I_{[a_{n}, b_{n}]}(x)-I_{[a, b]}(x))^{2}=0.$ (4.7)

Based on (4.6) and (4.7), we have

$\lim\limits_{n\rightarrow\infty}Pr_{b}\left\{\mid\nu_{n}^{b}(I_{[a_{n}, b_{n}]}(x))-\nu_{n}^{b}(I_{[a, b]}(x))\mid>\varepsilon\right\}=0\;\;\forall\;\;\varepsilon>0\;\;{\rm a.s.}.$

Then (4.5) is proven.

Next, we will prove

$\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}^{b}-f(b)\nu_{n}^{b}(h_{2})=o_{p^{b}}(1)\;\;{\rm a.s.}.$ (4.8)

We can write

$\quad\mid \sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}^{b}-f(b)\nu_{n}^{b}(h_{2})\mid \\ = \mid \sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}^{b}-\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}+\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}-\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF+\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF-f(b)\nu_{n}^{b}(h_{2})\mid \\ \leq \mid \sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}^{b}-\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}\mid+\mid \sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}-\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF\mid\\ \quad +\mid \sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF-f(b)\nu_{n}^{b}(h_{2})\mid \\ = \gamma_{1}+\gamma_{2}+\gamma_{3}. $

In the following, we will prove the convergence of above three terms separately. At first, we can write

$ \begin{aligned} \gamma_{1}&= \mid \nu_{n}^{b}(-\infty, b_{n}^{b})-\nu_{n}^{b}(-\infty, b_{n})\mid \\& \leq \sup\limits_{|b_{n}-\lambda|<\delta}\mid \nu_{n}^{b}(-\infty, \lambda)-\nu_{n}^{b}(-\infty, b_{n})\mid\;\;\text{if}\;\;|b_{n}^{b}-b_{n}|<\delta. \end{aligned}$

Then given $\varepsilon$,

$\begin{aligned} \Pr_{b}\{\mid \sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}^{b}-\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}\mid>\varepsilon\}&\leq \Pr_{b}\{\sup\limits_{|s-t|<\delta}\mid \nu_{n}^{b}(-\infty, s]-\nu_{n}^{b}(-\infty, t]\mid>\varepsilon\} \\&\quad + \Pr_{b}\{|b_{n}^{b}-b_{n}|>\delta\}\;\;\text{for all}\;\;\delta.\end{aligned}$ (4.9)

According to the conditions in the lemma, we know that $\sqrt{n}(b_{n}^{b}-b_{n})-\nu_{n}^{b}(h_{2})\rightarrow 0$ in $\Pr_{b}$ ${\rm a.s.}$ and we also know $\nu_{n}^{b}(h_{2})\rightarrow_{L^{b}}N(0, {\rm Var}_{p}(h_{2}))$. So, we have

$\Pr\limits_{b}(\sqrt{n}|b_{n}^{b}-b_{n}|>M_{n})\rightarrow 0\;\;\;{\rm a.s.}\;\; \text{for any} \;\;M_{n}\rightarrow\infty.$

Let $M_{n}=\frac{\sqrt{n}}{(\log\log n)^{2}}\rightarrow \infty$; then $\Pr_{b}\left(\sqrt{n}|b_{n}^{b}-b_{n}|>\frac{\sqrt{n}}{(\log\log n)^{2}}\right)\rightarrow 0$ a.s.. Then

$\Pr\limits_{b}\left(|b_{n}^{b}-b_{n}|>\frac{1}{(\lg\lg n)^{2}}\right)\rightarrow 0\;\;\;\;{\rm a.s.}.$

We have $b_{n}^{b}-b_{n}\rightarrow0$ in $\Pr_{b}$ a.s.. Similarly, we can prove $a_{n}^{b}-a_{n}\rightarrow0$ in $\Pr_{b}$ a.s.. Since $a_{n}\rightarrow a$ and $b_{n}\rightarrow b$ a.s., it follows that $a_{n}^{b}\rightarrow a$ and $b_{n}^{b}\rightarrow b$ in $\Pr_{b}$ a.s.. In particular, $\lim\limits_{n\rightarrow\infty}\Pr_{b}\{|b_{n}^{b}-b_{n}|>\delta\}= 0$ a.s. for any $\delta>0$. Since $E_{p}(I_{(-\infty, s]}-I_{(-\infty, t]})^{2}\leq |s-t|\sup\limits_{x\in K}f(x)$, we have the following according to (4.6):

$\lim\limits_{\delta\rightarrow 0}\limsup\limits_{n\rightarrow\infty}\Pr\limits_{b}\{\sup\limits_{|s-t|<\delta}|\nu_{n}^{b}(-\infty, s]-\nu_{n}^{b}(-\infty, t]|>\varepsilon\}= 0\;\;\;{\rm a.s.}.$

Then taking limits first as $n\rightarrow\infty$ and then as $\delta\rightarrow0$, we can obtain the following result based on (4.9):

$\gamma_{1}=\mid \sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}^{b}-\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}\mid=o_{p^{b}}(1) {\rm a.s.}.$ (4.10)

Next, we focus on $\gamma_{2}$.

$\quad \Pr\limits_{b}\{\mid \sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}-\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF\mid>\varepsilon\} \\ = \Pr\limits_{b}\{\mid \nu_{n}(-\infty, b_{n}]-\nu_{n}(-\infty, b_{n}^{b})\mid>\varepsilon\} \\ \leq \Pr\limits_{b}\{\sup\limits_{|b_{n}-\lambda|< \frac{1}{(\log \log n)^{2}}}\mid \nu_{n}(-\infty, b_{n}]-\nu_{n}(-\infty, \lambda]\mid>\varepsilon\}+Pr_{b} \{|b_{n}-b_{n}^{b}|>\frac{1}{(\log \log n)^{2}}\} \\ = \xi_{1}+\xi_{2}.$ (4.11)

$\xi_{2}\rightarrow0$ in $P^{b}$ has been proven above. $\xi_{1}\rightarrow0$ in $P^{b}$ follows from the result below; see Giné [10].

Since the class of functions

$\mathcal{F}=\{I_{[a, b]}(x), I_{(-\infty, a]}(x), xI_{[a, b]}(x):\; -\infty<c_{1}\leq a<b\leq c_{2}<\infty\}$

is uniform $P$-Donsker, we have

$\lim\limits_{n\rightarrow\infty}\sup\limits_{g, h\in\mathcal{F};E_{p}(g-h)^{2}<1/(\log\log n)^{2}}\mid \nu_{n}(g)-\nu_{n}(h)\mid=0\;\;\; {\rm a.s.}.$

Then we have

$\gamma_{2}=\mid \sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}-\sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF\mid=o_{P^{b}}(1)\;\;\;{\rm a.s..}$ (4.12)

At last, we focus on $\gamma_{3}$:

$ \begin{aligned} \gamma_{3}&= \mid \sqrt{n}\left(\int_{b_{n}}^{b_{n}^{b}}dF\right)-f(b)\nu_{n}^{b}(h_{2})\mid \\ &= \mid \sqrt{n}(F(b_{n}^{b})-F(b_{n}))-f(b)\nu_{n}^{b}(h_{2})\mid \\ &= \mid \sqrt{n}(F(b_{n}^{b})-F(b_{n}))-f(b)\sqrt{n}(b_{n}^{b}-b_{n})+f(b)\cdot o_{P^{b}}(1)\mid. \end{aligned}$

Since $f(b)$ is bounded, $ \gamma_{3}=\mid \sqrt{n}(F(b_{n}^{b})-F(b_{n}))-f(b)\sqrt{n}(b_{n}^{b}-b_{n})\mid+o_{P^{b}}(1). $ Recall the compact set $K$ defined before: $b_{n}^{b}$ lies in $K$ for all large $n$ with $\Pr_{b}$-probability tending to one ${\rm a.s.}$. Then, by the mean value theorem, there exists a point $\eta_{n}$ between $b_{n}$ and $b_{n}^{b}$ such that

$ \begin{aligned} \gamma_{3}&= \mid \sqrt{n}(b_{n}^{b}-b_{n})f(\eta_{n})-f(b)\sqrt{n}(b_{n}^{b}-b_{n})\mid+o_{P^{b}}(1) \\ &= \mid\sqrt{n}(b_{n}^{b}-b_{n})\mid\, |f(\eta_{n})-f(b)|+o_{P^{b}}(1) \\ &\leq (|\nu_{n}^{b}(h_{2})|+o_{P^{b}}(1))|f(\eta_{n})-f(b)|+o_{P^{b}}(1). \end{aligned}$

Since $f$ is continuous, $|f(\eta_{n})-f(b)|\rightarrow0$ in $P^{b}$ ${\rm a.s.}$, and $\nu_{n}^{b}(h_{2})$ is $O_{P^{b}}(1)$ a.s.. We have

$\gamma_{3}=o_{P^{b}}(1)\;\;\;\; {\rm a.s.}.$ (4.13)

Based on (4.10), (4.12) and (4.13), we have

$ \sqrt{n}\int_{b_{n}}^{b_{n}^{b}}dF_{n}^{b}-f(b)\nu_{n}^{b}(h_{2})=o_{P^{b}}(1)\;\;\;{\rm a.s.}.$ (4.14)

Similarly, we can prove

$ \sqrt{n}\int_{a_{n}^{b}}^{a_{n}}dF_{n}^{b}+f(a)\nu_{n}^{b}(h_{1})=o_{P^{b}}(1)\;\;\;{\rm a.s.}.$ (4.15)

Combining the above results, we obtain (4.1). In the same spirit, we can also prove (4.2) and (4.3). This completes the proof of Lemma 4.

Based on the above lemma, we can prove the following theorem.

Theorem 3 Assume conditions (D.2), (D.3) and (LS) hold. Suppose $a, b\in B_{f}$, $a_{n}\rightarrow a$ a.s. and $b_{n}\rightarrow b$ a.s.. Assume also that $a_{n}^{b}$ and $b_{n}^{b}$, defined respectively as $a_{n}^{b}=a(F_{n}^{b})$ and $b_{n}^{b}=b(F_{n}^{b})$, satisfy

$({\rm Lb})\;\;\;\;\;\;\;\;\sqrt{n}(a_{n}^{b}-a_{n})= o_{P_{b}}(1), \;\;\sqrt{n}(b_{n}^{b}-b_{n})= o_{P_{b}}(1)\;\; {\rm a.s.}.$

Define $\theta_{n}^{b}$ as the bootstrap trimmed mean, $\displaystyle \theta_{n}^{b}=\frac{\sum\limits_{i=1}^{n}X_{n, i}^{b}I_{[a_{n}^{b}, b_{n}^{b}]}(X_{n, i}^{b})}{\sum\limits_{i=1}^{n}I_{[a_{n}^{b}, b_{n}^{b}]}(X_{n, i}^{b})}, \;\;\; n\in N, $ where $\theta_{n}$ and $\theta$ are as in Theorem 2. Define

$ (S_{n}^{b})^{2}=\frac{\sum\limits_{i=1}^{n}(X_{n, i}^{b}-\theta_{n}^{b})^{2}I_{[a_{n}^{b}, b_{n}^{b}]}(X_{n, i}^{b})}{\sum\limits_{i=1}^{n}I_{[a_{n}^{b}, b_{n}^{b}]}(X_{n, i}^{b})}. $

Denote $(\sigma_{n}^{b})^{2}=\frac{(S_{n}^{b})^{2}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}$. We have

$\frac{\sqrt{n}(\theta_{n}^{b}-\theta_{n})}{\sigma_{n}^{b}}\rightarrow_{L^{b}}N(0, 1)\;\;\; {\rm a.s.}.$ (4.16)

Proof In the proof in above section, we have shown ${\rm Var}_{F}(g)=\frac{S^{2}}{\int_{a}^{b}dF(t)}$ where $S^{2}=\frac{\int_{a}^{b}(x-\theta)^{2}dF}{\int_{a}^{b}dF}$ and $\theta=\frac{\int_{a}^{b}xdF}{\int_{a}^{b}dF}$. Denote $\sigma^{2}=\frac{S^{2}}{\int_{a}^{b}dF(t)}$. We have

$ \frac{\sqrt{n}(\theta_{n}^{b}-\theta_{n})}{\sigma_{n}^{b}}=\frac{\sqrt{n}(\theta_{n}^{b}-\theta_{n})}{\sqrt{Var_{F}(g)}}\times \frac{\sqrt{{\rm Var}_{F}(g)}}{\sigma}\times \frac{\sigma}{\sigma_{n}^{b}}=\frac{\sqrt{n}(\theta_{n}^{b}-\theta_{n})}{\sqrt{{\rm Var}_{F}(g)}}\times \frac{\sigma}{\sigma_{n}^{b}}.$ (4.17)

Under more general conditions, CG proved

$ \sqrt{n}(\theta_{n}^{b}-\theta_{n}) \rightarrow_{L_{b}} \sqrt{{\rm Var}_{F}(g)}Z.$ (4.18)

To prove (4.16), we only need to show $\sigma_{n}^{b}\rightarrow_{\Pr_{b}}\sigma \;\; {\rm a.s.}.$ Based on Lemma 4, we have

$ \begin{aligned} (S_{n}^{b})^{2}&= \frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x^{2}dF_{n}^{b}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}-\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}xdF_{n}^{b}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}xdF_{n}^{b}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}} \\ &= \frac{\int_{a_{n}}^{b_{n}}dF_{n}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x^{2}dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}-\frac{\int_{a_{n}}^{b_{n}}dF_{n}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}xdF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}xdF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}}^{b_{n}}dF_{n}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}. \end{aligned}$ (4.19)

According to (4.1), $\displaystyle \int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}-\int_{a_{n}}^{b_{n}}dF_{n}=\frac{1}{\sqrt{n}}(\nu_{n}^{b}(g_{1}))+o_{P^{b}}(1)=o_{P^{b}}(1)\;\; {\rm a.s.}, $ where $\nu_{n}^{b}(g_{1})$ is bounded in $P^{b}$. We can write

$ \frac{\int_{a_{n}}^{b_{n}}dF_{n}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}=1+\tau_{n},$ (4.20)

where $\tau_{n}\rightarrow0$ in $P^{b}$ ${\rm a.s.}$. Substituting (4.20) into (4.19):

$\begin{aligned} (S_{n}^{b})^{2}&= \frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x^{2}dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}(1+\tau_{n})-\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}(1+\tau_{n})^{2} \\ &= (\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x^{2}dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}-\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}})+\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x^{2}dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}\tau_{n} \\&\quad-\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}2\tau_{n}+\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}\tau_{n}^{2}\\&=\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x^{2}dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}-\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}+o_{P^{b}}(1)\;\; {\rm a.s.}. \end{aligned}$

Based on Lemma 4, we have

$ \begin{aligned} (S_{n}^{b})^{2}&= \frac{\int_{a_{n}^{b}}^{b_{n}^{b}}x^{2}dF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}-\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}xdF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}xdF_{n}^{b}}{\int_{a_{n}}^{b_{n}}dF_{n}}+o_{P^{b}}(1)\;\;{\rm a.s.} \\ &= \frac{1}{\sqrt{n}}\frac{\sqrt{n}\int_{a_{n}}^{b_{n}}x^{2}dF_{n}+\nu_{n}^{b}(g_{3})}{\int_{a_{n}}^{b_{n}}dF_{n}}-\frac{1}{n}\frac{(\sqrt{n}\int_{a_{n}}^{b_{n}}xdF_{n}+\nu_{n}^{b}(g_{2}))^{2}}{\int_{a_{n}}^{b_{n}}dF_{n}\int_{a_{n}}^{b_{n}}dF_{n}}+o_{P^{b}}(1)\;\;{\rm a.s.} \\ &= \frac{\int_{a_{n}}^{b_{n}}x^{2}dF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}- \left(\frac{\int_{a_{n}}^{b_{n}}xdF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}\right)^{2}+\frac{1}{\sqrt{n}}\frac{\nu_{n}^{b}(g_{3})}{\int_{a_{n}}^{b_{n}}dF_{n}}-\frac{2}{\sqrt{n}}\frac{\int_{a_{n}}^{b_{n}}xdF_{n}}{(\int_{a_{n}}^{b_{n}}dF_{n})^{2}}\nu_{n}^{b}(g_{2}) \\ &\quad - \frac{1}{n}\frac{(\nu_{n}^{b}(g_{2}))^{2}}{(\int_{a_{n}}^{b_{n}}dF_{n})^{2}} +o_{P^{b}}(1)\;\;\;{\rm a.s.} \\ &= \frac{\int_{a_{n}}^{b_{n}}x^{2}dF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}-\left(\frac{\int_{a_{n}}^{b_{n}}xdF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}\right)^{2}+o_{P^{b}}(1)\;\;\;{\rm a.s.} \\ &= S_{n}^{2}+o_{P^{b}}(1)\;\;\;{\rm a.s.} \end{aligned}$

Under the assumption $a_{n}\rightarrow a$ ${\rm a.s.}$ and $b_{n}\rightarrow b$ ${\rm a.s.}$, we can prove $S_{n}^{2}\rightarrow S^{2}$ ${\rm a.s.}$ in the same spirit as Lemma 3. Then we have $(S_{n}^{b})^{2}=S^{2}+o_{P^{b}}(1)$ ${\rm a.s.}$ We can also prove $\int_{a_{n}}^{b_{n}}dF_{n}\rightarrow\int_{a}^{b}dF$ ${\rm a.s.}$

Based on the definition of $\sigma_{n}^{b}$, we have

$ (\sigma_{n}^{b})^{2}=\frac{(S_{n}^{b})^{2}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}= \frac{S^{2}}{\int_{a}^{b}dF}\frac{\int_{a}^{b}dF}{\int_{a_{n}}^{b_{n}}dF_{n}}\frac{\int_{a_{n}}^{b_{n}}dF_{n}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}=\sigma^{2}+o_{P^{b}}(1)\;\;\;{\rm a.s.}. $

Then the theorem is proven.
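The conclusion $(S_{n}^{b})^{2}=S_{n}^{2}+o_{P^{b}}(1)$ a.s. can be illustrated numerically. The following Monte Carlo sketch is not part of the proof; the sample size, trimming depth, and the uniform choice of $F$ are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def trimmed_var(x, j):
    """Trimmed sample variance S_n^2: the variance of the data after
    removing the j smallest and j largest observations."""
    xs = np.sort(x)
    core = xs[j:len(xs) - j]
    return core.var()  # ddof=0, matching the plug-in form of S_n^2

# Illustrative sizes; j = o(sqrt(n)) as in the theorem's assumptions.
n, j = 2000, 20
x = rng.uniform(0.0, 1.0, size=n)   # hypothetical choice of F

s2 = trimmed_var(x, j)              # S_n^2 from the original sample

# One bootstrap resample: draw n points with replacement and trim it
# the same way; (S_n^b)^2 should be close to S_n^2.
xb = rng.choice(x, size=n, replace=True)
s2_b = trimmed_var(xb, j)

print(s2, s2_b)  # both near Var(Uniform[0,1]) = 1/12
```

Averaging `s2_b` over many resamples would tighten the agreement further, mirroring the $o_{P^{b}}(1)$ statement.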

5 An Example

To examine the application of the above theorems, we consider the following simple example from order statistics. Let $X$, $X_{1}$, $\cdots$, $X_{n}$, $\cdots$ be independent identically distributed real random variables with support $[a, b]$ and density $f$ satisfying $f(x)\neq 0$ for $x\in[a, b]$. Let $a_{n}=X_{(j_{n})}$, $b_{n}=X_{(n-j_{n})}$. Then, if $j_{n}=o(\sqrt{n})$, we have

$\sqrt{n}(a_{n}-a)\rightarrow 0, \;\;\;\sqrt{n}(b_{n}-b)\rightarrow 0.$ (5.1)

To prove the above equations, we may assume without loss of generality that $a=0$. $F(X)$ is uniform on $[0, 1]$ and $(F(X))_{(j_{n})}=F(X_{(j_{n})})$ by monotonicity. Also,

$P(X_{(j_{n})}>\frac{\varepsilon}{\sqrt{n}})=P(F(X_{(j_{n})})>F(\frac{\varepsilon}{\sqrt{n}}))$

and $F(\frac{\varepsilon}{\sqrt{n}})\approx f(0)\frac{\varepsilon}{\sqrt{n}}. $ Hence we may also assume that the law of $X$ is uniform on $[0, 1]$. It is well known that the law of $X_{(j_{n})}$ in this case is $ L(X_{(j_{n})})=L(\frac{w_{1}+\cdots+w_{j_{n}}}{w_{1}+\cdots+w_{n}}), $ where the $w_{i}$ are i.i.d. exponential with parameter $\lambda=1$ (e.g. Breiman [11]). Then

$P\left(X_{(j_{n})}>\frac{\varepsilon}{\sqrt{n}}\right)=P\left(\frac{w_{1}+\cdots+w_{j_{n}}-j_{n}}{\sqrt{j_{n}}}>\frac{\varepsilon}{\sqrt{nj_{n}}}(w_{1}+\cdots+w_{n}-n)+\frac{\varepsilon\sqrt{n}}{\sqrt{j_{n}}}-\sqrt{j_{n}}\right).$

Assuming $j_{n}\rightarrow\infty$, by the central limit theorem this probability tends to zero if and only if $\frac{\varepsilon\sqrt{n}}{\sqrt{j_{n}}}-\sqrt{j_{n}}\rightarrow\infty$, which happens for all $\varepsilon>0$ if and only if $j_{n}=o(\sqrt{n})$.
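The exponential representation also lends itself to a quick numerical check that $\sqrt{n}\,X_{(j_{n})}\rightarrow 0$ when $j_{n}=o(\sqrt{n})$. A minimal sketch, where the sample sizes, the choice $j_{n}=\lfloor n^{1/4}\rfloor$, and the replication count are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def sqrtn_scaled_order_stat(n, j, reps=500):
    """Monte Carlo draws of sqrt(n) * X_(j) for a uniform sample of size n,
    using the representation X_(j) =d (w_1+...+w_j)/(w_1+...+w_n)
    with w_i i.i.d. exponential(1)."""
    w = rng.exponential(1.0, size=(reps, n))
    csum = np.cumsum(w, axis=1)           # partial sums w_1+...+w_k
    xj = csum[:, j - 1] / csum[:, n - 1]  # the order statistic X_(j)
    return np.sqrt(n) * xj

# With j_n = floor(n^{1/4}) = o(sqrt(n)), E[sqrt(n) X_(j_n)] ~ j_n/sqrt(n) -> 0.
means = {}
for n in (100, 10000):
    jn = max(1, int(n ** 0.25))
    means[n] = sqrtn_scaled_order_stat(n, jn).mean()
print(means)  # the mean shrinks as n grows
```

The shrinking mean reflects exactly the dichotomy derived above: the centered term $\frac{\varepsilon\sqrt{n}}{\sqrt{j_{n}}}-\sqrt{j_{n}}$ diverges precisely when $j_{n}=o(\sqrt{n})$.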

In this case, we can define the trimmed mean

$\theta_{n}=\frac{\sum\limits_{i=1}^{n}X_{i}I(X_{(j_{n}+1)}\leq X_{i}\leq X_{(n-j_{n})})}{\sum\limits_{i=1}^{n}I(X_{(j_{n}+1)}\leq X_{i}\leq X_{(n-j_{n})})}=\frac{\sum\limits_{j=j_{n}+1}^{n-j_{n}}X_{(j)}}{n-2j_{n}},$

which is the average of the data after removing the $j_{n}$ smallest and the $j_{n}$ largest observations (the potential outliers). Let $\theta(P)=E(X)$. According to Theorem 3, we have

$\frac{\sqrt{n}(\theta_{n}-E(X))}{\sigma_{n}}\rightarrow_{d}N(0, 1), \quad \text{where} \quad \sigma_{n}^{2}=\frac{n\sum\limits_{j=j_{n}+1}^{n-j_{n}}(X_{(j)}-\theta_{n})^{2}}{(n-2j_{n})^{2}},$

since $\sigma_{n}^{2}=S_{n}^{2}/\int_{a_{n}}^{b_{n}}dF_{n}$ with $\int_{a_{n}}^{b_{n}}dF_{n}=(n-2j_{n})/n$.
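As a sanity check on the studentized CLT in this example, one can simulate uniform data and verify that the statistic is approximately standard normal. A minimal sketch, assuming $F$ is Uniform$[0,1]$ (so $E(X)=1/2$) and illustrative values of $n$, $j_{n}$, and the replication count; the form of $\sigma_{n}^{2}$ used here is the plug-in $S_{n}^{2}/\int_{a_{n}}^{b_{n}}dF_{n}$:

```python
import numpy as np

rng = np.random.default_rng(2)

def studentized_trimmed_mean(x, j):
    """Trimmed mean theta_n and scale sigma_n for the example above.
    sigma_n^2 = n * sum (X_(j) - theta_n)^2 / (n - 2j)^2 is the plug-in
    S_n^2 / integral dF_n, an assumption of this sketch."""
    n = len(x)
    core = np.sort(x)[j:n - j]          # keep X_(j+1), ..., X_(n-j)
    theta = core.mean()
    sigma2 = n * np.sum((core - theta) ** 2) / (n - 2 * j) ** 2
    return theta, np.sqrt(sigma2)

# Monte Carlo check: the studentized statistic should look N(0, 1).
n, jn, reps = 400, 3, 3000   # jn ~ n^{1/4} = o(sqrt(n)); sizes illustrative
mu = 0.5                     # E(X) for Uniform[0, 1]
stats = []
for _ in range(reps):
    x = rng.uniform(0.0, 1.0, size=n)
    theta, sigma = studentized_trimmed_mean(x, jn)
    stats.append(np.sqrt(n) * (theta - mu) / sigma)
stats = np.asarray(stats)
print(stats.mean(), stats.std())  # near 0 and 1
```

Because the trimming depth is $o(\sqrt{n})$, the statistic is asymptotically pivotal: its limit does not depend on the underlying density, which is the point of the theorems above.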
References
[1] Chen Z, Giné E. Another approach to asymptotics and bootstrap of randomly trimmed means[J]. Annals Institute of Stat. Math., 2004, 56(4): 771–790. DOI:10.1007/BF02506489
[2] Hampel F. A general qualitative definition of robustness[J]. Annals Math. Stat., 1971, 42(6): 1887–1896. DOI:10.1214/aoms/1177693054
[3] Hall P, Padmanabhan A R. On the bootstrap and the trimmed mean[J]. J. Multi. Anal., 1992, 41(1): 132–153. DOI:10.1016/0047-259X(92)90062-K
[4] Shorack G R. Uniform CLT, WLLN, LIL and bootstrapping in a data analytic approach to trimmed L-statistics[J]. J. Stat. Plan. Infer., 1997, 60(1): 1–44. DOI:10.1016/S0378-3758(97)00121-3
[5] Hampel F. The breakdown points of the mean combined with some rejection rules[J]. Technometrics, 1985, 27(2): 95–107. DOI:10.1080/00401706.1985.10488027
[6] Kim S J. The metrically trimmed mean as a measure of location[J]. Annals Stat., 1992, 20(3): 1534–1547. DOI:10.1214/aos/1176348783
[7] Huber P. Robust statistics[M]. New York: Wiley, 1981.
[8] Dudley R. Uniform central limit theorems[M]. New York: Cambridge University Press, 1999.
[9] Hall P. Rate of convergence in bootstrap approximations[J]. The Annals of Probability, 1988, 16(4): 1665–1684. DOI:10.1214/aop/1176991590
[10] Giné E. Lectures on some aspects of the bootstrap[J]. Lecture Notes in Mathematics, 1996: 37–151.
[11] Breiman L. Probability[M]. New Jersey: Addison-Wesley, 1968.