It is well known that the sample mean is very sensitive to outliers in the data. As an alternative, randomly trimmed means have been studied widely in statistics. A randomly trimmed mean is constructed as follows: choose two levels $a_{n}<b_{n}$ depending on the sample and average the data points between $a_{n}$ and $b_{n}$. Such levels can be obtained by evaluating functions $a(F)$ and $b(F)$ at the empirical distribution $F_{n}$, where $F$ is the probability distribution of $X$. Taking $a(F)=F^{-1}(\alpha/2)$ and $b(F)=F^{-1}(1-\alpha/2)$ gives the classical trimming of $X$ at level $\alpha$. Taking $a(F)=\mu(F)-cs(F)$ and $b(F)=\mu(F)+cs(F)$, where $\mu$ denotes the median, $s$ the median absolute deviation (MAD), and $c$ is a constant of choice, gives the trimmed mean of Hampel [2].
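As a concrete illustration, here is a minimal numerical sketch of both constructions (a sketch only: the function names are ours, and NumPy is assumed):

```python
import numpy as np

def trimmed_mean(x, a_n, b_n):
    """Average of the observations falling in [a_n, b_n]."""
    inside = x[(x >= a_n) & (x <= b_n)]
    return inside.mean()

def classical_levels(x, alpha):
    """Classical trimming: a_n = F_n^{-1}(alpha/2), b_n = F_n^{-1}(1 - alpha/2)."""
    return np.quantile(x, alpha / 2), np.quantile(x, 1 - alpha / 2)

def hampel_levels(x, c):
    """Hampel trimming: median +/- c * MAD."""
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return med - c * mad, med + c * mad

# Example: both trimmed means on data with one gross outlier.
x = np.append(np.random.default_rng(0).normal(size=99), 50.0)
print(trimmed_mean(x, *classical_levels(x, alpha=0.1)))
print(trimmed_mean(x, *hampel_levels(x, c=3.0)))
```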
Trimmed means are robust and can be bootstrapped. Hall and Padmanabhan [3] study the bootstrap for the studentized classical trimmed mean. Shorack [4] provides a systematic study of the bootstrap for various L-statistics. In a recent paper, Chen and Giné [1] (henceforth CG) present a unified, empirical-process-based approach to the central limit theorem and the bootstrap central limit theorem for the general trimmed mean under mild assumptions on the levels $a(F)$ and $b(F)$. Their basic assumption is that $\sqrt{n}(a_{n}-a)$ can be asymptotically linearized, that is, that it is asymptotically equivalent to $\sum\limits_{i=1}^{n}(h_{1}(X_{i})-Eh_{1}(X))/\sqrt{n}$ for some function $h_{1}$ that is square integrable with respect to $F$. A similar condition is assumed for $\sqrt{n}(b_{n}-b)$.
Despite their generality, the central limit theorems of CG involve a complicated expression for the asymptotic variance of trimmed means, one that depends on the underlying density of $X$. Following the empirical-process approach of CG, we study studentized randomly trimmed means and their bootstrap under relaxed conditions. This allows us to avoid the complicated variance expression in our central limit theorems. The results in this thesis may be used to obtain asymptotic confidence intervals and tests for trimmed means, and these extend to confidence intervals and tests for the usual mean; we do not pursue that here.
This thesis is organized as follows. In Section 2, we introduce some definitions and the main results of CG. Section 3 studies the asymptotic properties of studentized trimmed means. Section 4 verifies the validity of the bootstrap. A simple example is presented in Section 5.
Before describing our findings, we introduce some basic definitions in this section and review some existing results.
Let $X, X_{1}, \cdots, X_{n}, \cdots$ be independent identically distributed real random variables with common probability law $P$ and, for each $n\in N$, let $P_{n}=\frac{1}{n}\sum\limits_{i=1}^{n}\delta_{X_{i}}$ be the empirical measure corresponding to the first $n$ observations $X_{1}, \cdots, X_{n}$. $F$ and $F_{n}$ denote the cumulative distribution functions associated with $P$ and $P_{n}(\omega)$ for all $n\in N$ and $\omega\in\Omega$. In addition, we assume $a(P)\leq b(P)$ and $F(b(P))-F(a(P)-)\neq 0$. The trimmed mean of $P$ based on $a$ and $b$ is defined as
\begin{equation*} \theta(P)=\frac{\int_{a(P)}^{b(P)}x\,dF(x)}{\int_{a(P)}^{b(P)}dF(x)}. \tag{2.1} \end{equation*}
For convenience, we write $a$ and $b$ for $a(P)$ and $b(P)$ in the following, and $a_{n}$ and $b_{n}$ for $a(P_{n})$ and $b(P_{n})$. Similarly to (2.1), we can also define the empirical trimmed mean based on $a_{n}$ and $b_{n}$ as follows:
\begin{equation*} \theta_{n}=\theta(P_{n})=\frac{\int_{a_{n}}^{b_{n}}x\,dF_{n}(x)}{\int_{a_{n}}^{b_{n}}dF_{n}(x)}. \tag{2.2} \end{equation*}
Let $X, X_{i}, i\in N$, be as above, with c.d.f. $F$ and density $f$. Let $P_{n}=\frac{1}{n}\sum\limits_{i=1}^{n}\delta_{X_{i}}$ be the empirical measure and $\nu_{n}=\sqrt{n}(P_{n}-P)$ the empirical process. Let $-\infty<a<b<\infty$ and let $a_{n}, b_{n}$ be random variables such that $-\infty<a_{n}\leq b_{n}<\infty$ ${\rm a.s.}$. Following CG, we assume the following conditions:
(D.1) The c.d.f. $F$ has a derivative $f$ on an open set containing $a$ and $b$, and $f$ is continuous there; hence $f$ is uniformly continuous on a compact set $K$ whose interior contains $a$ and $b$.
(D.2) $f(a)+f(b)\neq 0$.
(L) There exist measurable, $P$-square integrable functions $h_{1}$ and $h_{2}$ such that
\begin{equation*} \sqrt{n}(a_{n}-a)=\nu_{n}(h_{1})+o_{p}(1)\quad\text{and}\quad\sqrt{n}(b_{n}-b)=\nu_{n}(h_{2})+o_{p}(1). \end{equation*}
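For orientation we note (this standard example is ours, not part of CG's statement) that for the classical trimming $a_{n}=F_{n}^{-1}(\alpha/2)$, $b_{n}=F_{n}^{-1}(1-\alpha/2)$, the Bahadur representation of sample quantiles yields (L) with
\begin{equation*} h_{1}(x)=-\frac{I_{(-\infty, a]}(x)-\alpha/2}{f(a)},\qquad h_{2}(x)=-\frac{I_{(-\infty, b]}(x)-(1-\alpha/2)}{f(b)}, \end{equation*}
provided $f(a)>0$ and $f(b)>0$.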
Based on the above framework, CG prove the following lemma and theorem:
Lemma 1 Assume (D.1) and (L). Define, for $x\in R$,
\begin{equation*} g_{1}(x)=I_{[a, b]}(x)+f(b)h_{2}(x)-f(a)h_{1}(x),\qquad g_{2}(x)=xI_{[a, b]}(x)+bf(b)h_{2}(x)-af(a)h_{1}(x). \end{equation*}
Then
\begin{equation*} \sqrt{n}\Big(\int_{a_{n}}^{b_{n}}dF_{n}-\int_{a}^{b}dF\Big)=\nu_{n}(g_{1})+o_{p}(1)\quad\text{and}\quad \sqrt{n}\Big(\int_{a_{n}}^{b_{n}}x\,dF_{n}-\int_{a}^{b}x\,dF\Big)=\nu_{n}(g_{2})+o_{p}(1). \end{equation*}
Theorem 1 Assume (D.1), (D.2) and (L), and set
\begin{equation*} g=\frac{g_{2}-\theta g_{1}}{\int_{a}^{b}dF}, \end{equation*}
with $g_{1}$ and $g_{2}$ as defined in Lemma 1. Let $\theta_{n}$ be the trimmed mean based on $a_{n}$ and $b_{n}$ and let $\theta$ be its population counterpart, as defined above. Then
\begin{equation*} \sqrt{n}(\theta_{n}-\theta)\rightarrow({\rm Var}_{F}(g))^{1/2}Z \end{equation*}
in distribution, where $Z$ is standard normal.
Theorem 1 is a very general result. It covers many cases studied in the previous literature, such as Hampel's means (Hampel [2, 5]), symmetrically trimmed means (Huber [6]) and Kim's metrically trimmed means (Kim [7]). However, the variance of the randomly trimmed mean, ${\rm Var}_{F}(g)$, in Theorem 1 has a very complicated form. In particular, it depends on the underlying density through $f(a)$ and $f(b)$. With these constraints, Theorem 1 may be difficult to apply in practice, for example to construct confidence intervals and tests. To overcome this deficiency, we prove a studentized central limit theorem for randomly trimmed means whose asymptotic behavior does not depend on the underlying density, under relaxed conditions. Furthermore, the bootstrap is also valid in our setting.
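To see the dependence explicitly, combining Lemma 1 with the definition of $g$ in Theorem 1 gives (a direct computation we include for clarity)
\begin{equation*} g(x)=\frac{(x-\theta)I_{[a, b]}(x)+(b-\theta)f(b)h_{2}(x)-(a-\theta)f(a)h_{1}(x)}{\int_{a}^{b}dF}, \end{equation*}
so that ${\rm Var}_{F}(g)$ involves $f(a)$ and $f(b)$ unless $h_{1}=h_{2}=0$.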
In a similar spirit to Lemma 1, we prove the following lemma. It plays an important role in the proof of our studentized central limit theorem in the next section.
Lemma 2 Assume (D.1) and (L) hold. Define, for $x\in R$,
\begin{equation*} g_{3}(x)=x^{2}I_{[a, b]}(x)+b^{2}f(b)h_{2}(x)-a^{2}f(a)h_{1}(x). \end{equation*}
Then
\begin{equation*} \sqrt{n}\Big(\int_{a_{n}}^{b_{n}}x^{2}dF_{n}-\int_{a}^{b}x^{2}dF\Big)=\nu_{n}(g_{3})+o_{p}(1). \tag{3.1} \end{equation*}
Proof We can write the left side of (3.1) as follows:
Since $x$ ranges over the compact set $K$, $x^{2}$ is bounded there. Then we have that
is finite. Also, there is $\delta_{0}>0$ such that $[a-\delta_{0}, a+\delta_{0}]\cup[b-\delta_{0}, b+\delta_{0}]\subset K$. Therefore, by the asymptotic equicontinuity of the empirical process (see Dudley [8]), we have
for all $\varepsilon>0$. Since
the equicontinuity condition and condition (L) give, upon taking limits first as $n\rightarrow\infty$ and then as $\delta\rightarrow 0$, \begin{equation*} \sqrt{n}\int _{a_{n}}^{b_{n}}x^{2}d(F_{n}-F)=\nu_{n}(x^{2}I_{[a, b]}(x))+o_{p}(1). \end{equation*} Given $ 0<\delta\leq\delta_{0}$ and $M_{n}, n\in N$, such that $M_{n}\rightarrow \infty$ and $M_{n}/\sqrt{n}\rightarrow 0$, set
which tends to zero by uniform continuity of $f$ on $K$. Then, on the event where $|\nu_{n}(h_{2})-\sqrt{n}(b_{n}-b)|\leq \delta$ and $|\nu_{n}(h_{2})|\leq M_{n}$, we have
We get the last equation above since $\frac{\delta+M_{n}}{\sqrt{n}}\rightarrow 0$. Based on the above results, we have the following equation, for any $\varepsilon>0$:
Then we have
Similarly, we can prove
Then the lemma is proven.
Based on the above lemma, we can prove the following result:
Lemma 3 Assume conditions (D.1), (D.2) and (L) hold. Denote
\begin{equation*} S_{n}^{2}=\frac{\int_{a_{n}}^{b_{n}}(x-\theta_{n})^{2}dF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}, \end{equation*}
where $\theta_{n}=\frac{\int_{a_{n}}^{b_{n}}x\,dF_{n}}{\int_{a_{n}}^{b_{n}}dF_{n}}$. Then we have $S_{n}^{2}=S^{2}+o_{p}(1)$, where
\begin{equation*} S^{2}=\frac{\int_{a}^{b}(x-\theta)^{2}dF}{\int_{a}^{b}dF}. \end{equation*}
Proof
Since $\sqrt{n}[\int_{a_{n}}^{b_{n}}dF_{n}-\int_{a}^{b}dF]=\nu_{n}(g_{1})+o_{p}(1)$,
We can write $\frac{\int_{a}^{b}dF}{\int_{a_{n}}^{b_{n}}dF_{n}}=1+\tau_{n}$ where $\tau_{n}\rightarrow 0$ in probability. Then we have
Based on Lemma 1 and Lemma 2, we have
The lemma is proven.
Based on the above lemmas, we can obtain the following theorem.
Theorem 2 Assume conditions (D.1), (D.2) and (LS) hold:
(LS) $\sqrt{n}(a_{n}-a)=o_{p}(1)\;\;\text{and}\;\;\sqrt{n}(b_{n}-b)=o_{p}(1)$.
Define $\sigma_{n}^{2}=\frac{S_{n}^{2}}{\int_{a_{n}}^{b_{n}}dF_{n}}=\frac{\int_{a_{n}}^{b_{n}}(x-\theta_{n})^{2}dF_{n}}{(\int_{a_{n}}^{b_{n}}dF_{n})^{2}}$. Then the studentized trimmed mean satisfies $\frac{\sqrt{n}(\theta_{n}-\theta)}{\sigma_{n}}\rightarrow N(0, 1)$ in distribution, where $\theta=\frac{\int_{a}^{b}t\,dF}{\int_{a}^{b}dF}$ is the population randomly trimmed mean.
Proof Under (LS), condition (L) holds with $h_{1}=h_{2}=0$. We have
\begin{equation*} g_{1}(x)=I_{[a, b]}(x) \end{equation*}
and
\begin{equation*} g_{2}(x)=xI_{[a, b]}(x),\qquad\text{so that}\qquad g(x)=\frac{(x-\theta)I_{[a, b]}(x)}{\int_{a}^{b}dF}. \end{equation*}
Since $E_{F}(g)=\frac{\int_{a}^{b}tdF(t)}{\int_{a}^{b}dF(t)}-\frac{\int_{a}^{b}tdF(t)}{\int_{a}^{b}dF(t)}=0$, we have
\begin{equation*} {\rm Var}_{F}(g)=E_{F}(g^{2})=\frac{\int_{a}^{b}(x-\theta)^{2}dF}{(\int_{a}^{b}dF)^{2}}=\frac{S^{2}}{\int_{a}^{b}dF}. \end{equation*}
Since $S_{n}^{2}=S^{2}+o_{p}(1)$,
\begin{equation*} \sigma_{n}^{2}=\frac{S_{n}^{2}}{\int_{a_{n}}^{b_{n}}dF_{n}}\rightarrow\frac{S^{2}}{\int_{a}^{b}dF}={\rm Var}_{F}(g)\;\;\text{in probability}. \end{equation*}
According to Theorem 1, we have $\frac{\sqrt{n}(\theta_{n}-\theta)}{\sigma_{n}}\rightarrow N(0, 1)$ in distribution.
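To illustrate how Theorem 2 yields the asymptotic confidence intervals mentioned in the introduction, here is a hedged numerical sketch (the function name is ours; NumPy and SciPy are assumed, and the levels $a_{n}, b_{n}$ can come, e.g., from the `hampel_levels` helper sketched earlier):

```python
import numpy as np
from scipy.stats import norm

def studentized_trimmed_ci(x, a_n, b_n, alpha=0.05):
    """Asymptotic level-(1-alpha) CI for the trimmed mean via Theorem 2 (sketch)."""
    n = len(x)
    inside = x[(x >= a_n) & (x <= b_n)]
    p_n = len(inside) / n                               # \int_{a_n}^{b_n} dF_n
    theta_n = inside.mean()                             # trimmed mean theta_n
    s2_n = ((inside - theta_n) ** 2).sum() / (n * p_n)  # S_n^2
    sigma_n = np.sqrt(s2_n / p_n)                       # sigma_n^2 = S_n^2 / \int dF_n
    half = norm.ppf(1 - alpha / 2) * sigma_n / np.sqrt(n)
    return theta_n - half, theta_n + half
```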
In the previous section, we studied the asymptotic properties of studentized trimmed means. With small samples, however, estimators and tests based on those asymptotic properties may not be accurate. In this case, the bootstrap may provide an improvement. In the case of the mean, Hall [9] proved that if $E|X|^{3}<\infty$ then the bootstrap approximation is better than the normal approximation. Although we do not know whether this is the case for the trimmed mean, it is interesting to know that the studentized CLT for the trimmed mean can be bootstrapped.
Given the sample $X_{1}, \cdots, X_{n}$, we draw $n$ observations from it with replacement and denote the resulting bootstrap sample by $X_{n, 1}^{b}, \cdots, X_{n, n}^{b}$. Following the notation of CG, we write $F_{n}^{b}$, $P_{n}^{b}$ and $\nu_{n}^{b}$ for, respectively, the empirical c.d.f., the empirical measure and the empirical process based on this sample: $P_{n}^{b}(A)=\frac{1}{n}\sum\limits_{i=1}^{n}\delta_{X_{n, i}^{b}}(A)$, $F_{n}^{b}(x)=P_{n}^{b}(-\infty, x]$ and $\nu_{n}^{b}=\sqrt{n}(P_{n}^{b}-P_{n})$. In addition, we denote by $\Pr_{b}=\Pr_{b}(\omega)$ the conditional probability given the sample $X_{1}, \cdots, X_{n}$ (its dependence on $\omega$ will not be displayed, for convenience). Also, $L^{b}$ will denote conditional law given the sample. The symbol $o_{P_{b}}(1)$ means the following: $V_{n}(X_{n, 1}^{b}, \cdots, X_{n, n}^{b}, X_{1}, \cdots, X_{n})$ is $o_{P_{b}}(1)$ ${\rm a.s.}$ if, a.s., $\Pr_{b}\{|V_{n}|>\varepsilon\}\rightarrow 0$ for every $\varepsilon>0$. This is equivalent to the following: for almost every $\omega$, every subsequence of $V_{n}(\omega, \omega')$ has a further subsequence that converges to zero $\omega'$-${\rm a.s.}$.
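In code, one bootstrap replicate of the trimmed mean might look as follows (a sketch under our naming conventions; `level_fn` computes $(a(F_{n}^{b}), b(F_{n}^{b}))$ from a sample):

```python
import numpy as np

def bootstrap_trimmed_mean(x, level_fn, rng):
    """One bootstrap replicate: resample, recompute levels, recompute the mean."""
    xb = rng.choice(x, size=len(x), replace=True)   # X^b_{n,1}, ..., X^b_{n,n}
    a_b, b_b = level_fn(xb)                         # a_n^b = a(F_n^b), b_n^b = b(F_n^b)
    return xb[(xb >= a_b) & (xb <= b_b)].mean()     # bootstrap trimmed mean
```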
We will also need the following condition:
(D.3) $P$ has a density $f$ on $R$, the set $B_{f}=\{f>0\}$ is open, and $f$ is continuous on $B_{f}$.
CG prove a central limit theorem which can be considered the bootstrap version of Theorem 1. Under similar conditions, the bootstrap is also valid for the studentized trimmed mean CLT; this is Theorem 3 below. To show this result, we first need to prove the following lemma.
Lemma 4 Assume (D.2), (D.3), (L), $a, b\in B_{f}$, $a_{n}\rightarrow a$ ${\rm a.s.}$ and $b_{n}\rightarrow b$ ${\rm a.s.}$. Assume also that $a_{n}^{b}$ and $b_{n}^{b}$, defined respectively as $a_{n}^{b}=a(F_{n}^{b})$ and $b_{n}^{b}=b(F_{n}^{b})$, satisfy
\begin{equation*} \sqrt{n}(a_{n}^{b}-a_{n})=\nu_{n}^{b}(h_{1})+o_{P_{b}}(1)\;\;{\rm a.s.}\quad\text{and}\quad\sqrt{n}(b_{n}^{b}-b_{n})=\nu_{n}^{b}(h_{2})+o_{P_{b}}(1)\;\;{\rm a.s.} \end{equation*}
Define $g_{1}$, $g_{2}$ and $g_{3}$, for $x\in R$, as in Lemma 1 and Lemma 2. We have
\begin{equation*} \sqrt{n}\Big(\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}-\int_{a_{n}}^{b_{n}}dF_{n}\Big)=\nu_{n}^{b}(g_{1})+o_{P_{b}}(1)\;\;{\rm a.s.}, \tag{4.1} \end{equation*}
\begin{equation*} \sqrt{n}\Big(\int_{a_{n}^{b}}^{b_{n}^{b}}x\,dF_{n}^{b}-\int_{a_{n}}^{b_{n}}x\,dF_{n}\Big)=\nu_{n}^{b}(g_{2})+o_{P_{b}}(1)\;\;{\rm a.s.}, \tag{4.2} \end{equation*}
\begin{equation*} \sqrt{n}\Big(\int_{a_{n}^{b}}^{b_{n}^{b}}x^{2}dF_{n}^{b}-\int_{a_{n}}^{b_{n}}x^{2}dF_{n}\Big)=\nu_{n}^{b}(g_{3})+o_{P_{b}}(1)\;\;{\rm a.s.} \tag{4.3} \end{equation*}
Proof We provide the complete proof of (4.1) in the following; (4.2) and (4.3) can be proven in a similar spirit. Let us denote
According to Giné [10], the class of functions
is uniform P-Donsker. We have
where $\rightarrow _{L^{b}}$ denotes convergence in law conditionally on the sample and $G_{P}$ is a centered Gaussian process. Then $\nu_{n}^{b}(I_{[a, b]}(x))\rightarrow _{L^{b}}G_{P}(I_{[a, b]}(x))$ ${\rm a.s.}$.
Since
To prove $\sqrt{n}\int_{a_{n}}^{b_{n}}d(F_{n}^{b}-F_{n})\rightarrow_{L^{b}}G_{P}(I_{[a, b]}(x))\;\; {\rm a.s.}$, we only need to show
According to Giné [10],
where the class of functions $\mathcal{F}$ is uniform $P$-Donsker. In addition,
Based on (4.6) and (4.7), we have
Then (4.5) is proven.
Next, we will prove
We can write
In the following, we will prove the convergence of the above three terms separately. First, we can write
Then, given $\varepsilon>0$,
According to the conditions of the lemma, we know that $\sqrt{n}(b_{n}^{b}-b_{n})-\nu_{n}^{b}(h_{2})\rightarrow 0$ in $\Pr_{b}$ ${\rm a.s.}$, and we also know $\nu_{n}^{b}(h_{2})\rightarrow_{L^{b}}N(0, {\rm Var}_{P}(h_{2}))$ ${\rm a.s.}$. So we have
Let $M_{n}=\frac{\sqrt{n}}{(\log\log n)^{2}}\rightarrow \infty$; then $\Pr_{b}\left(\sqrt{n}|b_{n}^{b}-b_{n}|>\frac{\sqrt{n}}{(\log\log n)^{2}}\right)\rightarrow 0$ ${\rm a.s.}$. Then
We thus have $b_{n}^{b}-b_{n}\rightarrow0$ in $\Pr_{b}$ ${\rm a.s.}$, and similarly $a_{n}^{b}-a_{n}\rightarrow0$ in $\Pr_{b}$ ${\rm a.s.}$. Since $a_{n}\rightarrow a$ and $b_{n}\rightarrow b$ a.s., we then have $a_{n}^{b}\rightarrow a$ and $b_{n}^{b}\rightarrow b$ in $\Pr_{b}$ ${\rm a.s.}$. In particular, $\Pr_{b}\{|b_{n}^{b}-b_{n}|>\delta\}\rightarrow 0$ ${\rm a.s.}$ for any $\delta>0$. Since $E_{P}(I_{(-\infty, s)}-I_{(-\infty, t)})^{2}\leq |s-t|\sup\limits_{x\in K}f(x)$, we have the following, according to (4.6):
Then, taking limits first as $n\rightarrow\infty$ and then as $\delta\rightarrow0$, we obtain the following result based on (4.9):
Next, we focus on $\gamma_{2}$.
That $\xi_{2}\rightarrow0$ in $\Pr_{b}$ ${\rm a.s.}$ has been proven above. That $\xi_{1}\rightarrow0$ in $\Pr_{b}$ ${\rm a.s.}$ follows from the following result of Giné [10].
Since the class of functions
is uniform $P$-Donsker, we have
Finally, we consider $\gamma_{3}$:
Since $f(b)$ is bounded, $ \gamma_{3}=\mid \sqrt{n}(F(b_{n}^{b})-F(b_{n}))-f(b)\sqrt{n}(b_{n}^{b}-b_{n})\mid+o_{P_{b}}(1). $ Recall the compact set $K$ defined before: $b_{n}^{b}$ lies in $K$, for $n$ large enough, with $\Pr_{b}$-probability tending to one ${\rm a.s.}$ Then, by the mean value theorem, there exists a point $\eta_{n}$ between $b_{n}$ and $b_{n}^{b}$ such that
Since $f$ is continuous, $|f(\eta_{n})-f(b)|\rightarrow0$ in $\Pr_{b}$ ${\rm a.s.}$, and $\nu_{n}^{b}(h_{2})$ is $O_{P_{b}}(1)$ ${\rm a.s.}$. We have
Based on (4.9), (4.12) and (4.13), we have
Combining the above results, we obtain (4.1). In the same spirit, we can also prove (4.2) and (4.3). This completes the proof of Lemma 4.
Based on the above lemma, we can prove the following theorem.
Theorem 3 Assume conditions (D.2), (D.3) and (LS) hold. Suppose $a, b\in B_{f}$, $a_{n}\rightarrow a$ ${\rm a.s.}$ and $b_{n}\rightarrow b$ ${\rm a.s.}$. Assume also that $a_{n}^{b}$ and $b_{n}^{b}$, defined respectively as $a_{n}^{b}=a(F_{n}^{b})$ and $b_{n}^{b}=b(F_{n}^{b})$, satisfy
\begin{equation*} \sqrt{n}(a_{n}^{b}-a_{n})=o_{P_{b}}(1)\;\;{\rm a.s.}\quad\text{and}\quad\sqrt{n}(b_{n}^{b}-b_{n})=o_{P_{b}}(1)\;\;{\rm a.s.} \end{equation*}
Define $\theta_{n}^{b}$ as the bootstrap trimmed mean, $\displaystyle \theta_{n}^{b}=\frac{\sum\limits_{i=1}^{n}X_{n, i}^{b}I_{[a_{n}^{b}, b_{n}^{b}]}(X_{n, i}^{b})}{\sum\limits_{i=1}^{n}I_{[a_{n}^{b}, b_{n}^{b}]}(X_{n, i}^{b})}, \;\;\; n\in N, $ where $\theta_{n}$ and $\theta$ are as in Theorem 2. Define
\begin{equation*} (S_{n}^{b})^{2}=\frac{\int_{a_{n}^{b}}^{b_{n}^{b}}(x-\theta_{n}^{b})^{2}dF_{n}^{b}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}. \end{equation*}
Denote $(\sigma_{n}^{b})^{2}=\frac{(S_{n}^{b})^{2}}{\int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}}$. We have
\begin{equation*} \frac{\sqrt{n}(\theta_{n}^{b}-\theta_{n})}{\sigma_{n}^{b}}\rightarrow_{L^{b}}N(0, 1)\;\;{\rm a.s.} \tag{4.16} \end{equation*}
Proof In the proof in the previous section, we showed that ${\rm Var}_{F}(g)=\frac{S^{2}}{\int_{a}^{b}dF(t)}$, where $S^{2}=\frac{\int_{a}^{b}(x-\theta)^{2}dF}{\int_{a}^{b}dF}$ and $\theta=\frac{\int_{a}^{b}x\,dF}{\int_{a}^{b}dF}$. Denote $\sigma^{2}=\frac{S^{2}}{\int_{a}^{b}dF(t)}$. We have
Under more general conditions, CG have proven
To prove (4.16), we only need to show $\sigma_{n}^{b}\rightarrow\sigma$ in $\Pr_{b}$ ${\rm a.s.}$ Based on Lemma 4, we have
According to (4.1), $\displaystyle \int_{a_{n}^{b}}^{b_{n}^{b}}dF_{n}^{b}-\int_{a_{n}}^{b_{n}}dF_{n}=\frac{1}{\sqrt{n}}\nu_{n}^{b}(g_{1})+o_{P_{b}}(1)=o_{P_{b}}(1)\;\; {\rm a.s.}, $ since $\nu_{n}^{b}(g_{1})$ is bounded in $\Pr_{b}$-probability ${\rm a.s.}$ We can write
where $\tau_{n}\rightarrow0$ in $\Pr_{b}$ ${\rm a.s.}$. Substituting (4.20) into (4.19):
Based on Lemma 4, we have
Under the assumptions $a_{n}\rightarrow a$ ${\rm a.s.}$ and $b_{n}\rightarrow b$ ${\rm a.s.}$, we can prove $S_{n}^{2}\rightarrow S^{2}$ ${\rm a.s.}$ in a similar spirit to Lemma 3. Then we have $(S_{n}^{b})^{2}=S^{2}+o_{P_{b}}(1)$ ${\rm a.s.}$. We can also prove $\int_{a_{n}}^{b_{n}}dF_{n}\rightarrow\int_{a}^{b}dF$ ${\rm a.s.}$.
Based on the definition of $\sigma_{n}^{b}$, we have
Then the theorem is proven.
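As a usage sketch, Theorem 3 justifies a bootstrap-$t$ confidence interval along the following lines (function names and the number of replicates $B$ are our choices, not prescribed by the theorem; `level_fn` recomputes the trimming levels on each sample):

```python
import numpy as np

def trim_stats(sample, level_fn):
    """Trimmed mean theta and studentizing scale sigma (cf. Theorems 2 and 3)."""
    n = len(sample)
    a, b = level_fn(sample)
    inside = sample[(sample >= a) & (sample <= b)]
    p = len(inside) / n                   # \int dF_n over [a, b]
    theta = inside.mean()
    s2 = ((inside - theta) ** 2).mean()   # S_n^2
    return theta, np.sqrt(s2 / p)         # sigma_n^2 = S_n^2 / p

def bootstrap_t_ci(x, level_fn, B=2000, alpha=0.05, seed=0):
    """Bootstrap-t CI for the trimmed mean, as Theorem 3 permits (sketch)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_n, sigma_n = trim_stats(x, level_fn)
    t_b = np.empty(B)
    for k in range(B):
        xb = rng.choice(x, size=n, replace=True)
        theta_b, sigma_b = trim_stats(xb, level_fn)
        t_b[k] = np.sqrt(n) * (theta_b - theta_n) / sigma_b
    lo, hi = np.quantile(t_b, [alpha / 2, 1 - alpha / 2])
    return theta_n - hi * sigma_n / np.sqrt(n), theta_n - lo * sigma_n / np.sqrt(n)
```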
To examine the application of the above theorems, we consider the following simple example from order statistics. Let $X$, $X_{1}$, $\cdots$, $X_{n}$, $\cdots$ be independent identically distributed real random variables supported on $[a, b]$. Suppose $f$ is their density function, with $f(x)\neq 0$ for $x\in[a, b]$. Let $a_{n}=X_{(j_{n})}$, $b_{n}=X_{(n-j_{n})}$. Then, if $j_{n}=o(\sqrt{n})$, we have
\begin{equation*} \sqrt{n}(a_{n}-a)=o_{p}(1)\quad\text{and}\quad\sqrt{n}(b_{n}-b)=o_{p}(1), \end{equation*}
that is, condition (LS) holds.
To prove the above, we may assume without loss of generality that $a=0$. $F(X)$ is uniform on $[0, 1]$ and $(F(X))_{(j_{n})}=F(X_{(j_{n})})$ by monotonicity. Also,
and $F(\frac{\varepsilon}{\sqrt{n}})\approx f(0)\frac{\varepsilon}{\sqrt{n}}$. Hence we may also assume that the law of $X$ is uniform on $[0, 1]$. It is well known that the law of $X_{(j_{n})}$ in this case is $ L(X_{(j_{n})})=L(\frac{w_{1}+\cdots+w_{j_{n}}}{w_{1}+\cdots+w_{n}}), $ where the $w_{i}$ are i.i.d. exponential with $\lambda=1$ (e.g. Breiman [11]). Then
and, assuming $j_{n}\rightarrow\infty$, by the central limit theorem this probability tends to zero if and only if $\frac{\varepsilon\sqrt{n}}{\sqrt{j_{n}}}-\sqrt{j_{n}}\rightarrow\infty$, which happens for all $\varepsilon>0$ if and only if $j_{n}=o(\sqrt{n})$.
In this case, we can define the trimmed mean
\begin{equation*} \theta_{n}=\frac{\sum\limits_{i=1}^{n}X_{i}I_{[a_{n}, b_{n}]}(X_{i})}{\sum\limits_{i=1}^{n}I_{[a_{n}, b_{n}]}(X_{i})}=\frac{1}{n-2j_{n}+1}\sum\limits_{i=j_{n}}^{n-j_{n}}X_{(i)}, \end{equation*}
which is, up to the treatment of the endpoints, the average of the data with the smallest $j_{n}$ and the largest $j_{n}$ observations (the potential outliers) removed. Let $\theta(P)=E(X)$. According to Theorems 2 and 3, we have
\begin{equation*} \frac{\sqrt{n}(\theta_{n}-E(X))}{\sigma_{n}}\rightarrow N(0, 1)\;\;\text{in distribution}, \end{equation*}
and this studentized CLT can be bootstrapped.
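A hedged numerical sketch of this example, with the illustrative choice $j_{n}=\lfloor n^{1/4}\rfloor$ (so that $j_{n}\rightarrow\infty$ and $j_{n}=o(\sqrt{n})$; the function name is ours):

```python
import numpy as np

def order_stat_trimmed_mean(x, j_n=None):
    """Average of the sample between X_(j_n) and X_(n - j_n)."""
    n = len(x)
    if j_n is None:
        j_n = int(n ** 0.25)              # j_n -> infinity, j_n = o(sqrt(n))
    xs = np.sort(x)
    return xs[j_n - 1 : n - j_n].mean()   # keeps X_(j_n), ..., X_(n - j_n)

# Monte Carlo check against E(X) for a uniform sample on [0, 1].
rng = np.random.default_rng(0)
x = rng.uniform(size=10_000)
print(order_stat_trimmed_mean(x))         # should be close to 0.5
```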