Quantiles are important population characteristics. In some instances the quantile approach is feasible and useful when other approaches are out of the question. For example, to estimate the location parameter of a Cauchy distribution, with density $f(x)=1/\{\pi[1+(x-\mu)^2]\}$, $-\infty< x < \infty$, the sample mean $\overline{X}$ is not a consistent estimator of the location parameter $\mu$. However, the sample median $\hat\theta_{1/2}$ is $AN(\mu, \pi^2/4n)$ and thus quite well behaved.
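As a quick illustration (not part of the paper), the following simulation contrasts the sample mean and the sample median for a shifted standard Cauchy sample; the location $\mu=3$ and the sample size are arbitrary choices.

```python
import numpy as np

# Illustrative simulation: for Cauchy data the sample mean does not
# converge to mu, while the sample median does.
rng = np.random.default_rng(0)
mu, n = 3.0, 10_000
x = mu + rng.standard_cauchy(n)     # Cauchy(mu, 1) sample

sample_mean = x.mean()              # heavy tails: not consistent for mu
sample_median = np.median(x)        # AN(mu, pi^2 / (4n)): consistent
```

With $n=10{,}000$ the asymptotic standard deviation of the median is $\pi/(2\sqrt{n})\approx 0.016$, so the median lands very close to $\mu$, while the mean can be arbitrarily far away.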
Let $X_1, X_2, \ldots, X_n$ be a random sample from an unknown distribution $F(x)$ with density $f(x)$. Given $0 < q < 1$, we define the $q$-th quantile by $F^{-1}(q)=\inf\{x: F(x)\geq q\}$.
In this paper, we investigate how to apply empirical likelihood methods to inference about $\theta_q=F^{-1}(q)$ under right censorship. Assume that the variable $X$ is censored randomly on the right by a censoring variable $C$, so that it cannot always be observed completely. One observes only
$$Y_i=\min(X_i, C_i), \qquad \delta_i=I(X_i\leq C_i), \qquad i=1,2,\ldots,n,$$
where $I(A)$ is the indicator function of the event $A$. Suppose that $C$ is independent of $X$. The observations $\{(Y_i, \delta_i)\}_{i=1}^n$ form a random sample from the population $(Y, \delta)$.
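This censoring mechanism is easy to simulate; the exponential lifetime and censoring distributions and the sample size below are illustrative choices, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.exponential(scale=2.0, size=n)   # lifetimes X_i (illustrative)
c = rng.exponential(scale=4.0, size=n)   # independent censoring times C_i

y = np.minimum(x, c)                     # observed Y_i = min(X_i, C_i)
delta = (x <= c).astype(int)             # delta_i = I(X_i <= C_i)
```

Only the pairs $(Y_i,\delta_i)$ would be available to the statistician; $X_i$ itself is observed exactly when $\delta_i=1$.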
Empirical likelihood methods were first used by Thomas and Grunkemeier [1] and popularized by Owen [2-3]. It is well known that Owen's empirical likelihood is based on linear constraints and hence has very general applicability, for example to smooth functions of means (see DiCiccio et al. [4]), quantile estimation (see Chen [5]), estimating equations (see Qin and Lawless [6]), empirical likelihood confidence intervals (see [7-16]), and so on. For more details, we refer to Owen [17]. However, most of the references on empirical likelihood are concerned with complete data sets. In practice, censored data arise in opinion polls, market research surveys, mail enquiries, socio-economic investigations, medical studies, and other scientific experiments. Once the censored values are imputed, the data set can be analyzed using standard techniques for complete data.
The rest of this paper is arranged as follows. In Section 2, we propose an empirical likelihood method for quantiles. We derive the empirical log-likelihood ratio statistic for the quantiles and show that it is asymptotically chi-squared. The proofs are given in Section 3.
If $\theta_q$ is the $q$-quantile of $F(x)$, then $\theta_q$ coincides with the $M$-estimate defined by the equation
$$\int \phi(x-\theta_q)\,dF(x)=0,$$
with
$$\phi(u)=I(u\leq 0)-q.$$
Let $U_i(\theta_q)=\frac{\phi(Y_i-\theta_q)\delta_i}{1-G(Y_i)}$, where $G(\cdot)$ is the cumulative distribution function of the censoring variable $C$. Obviously, $\{U_i(\theta_q)\}_{i=1}^n$ are independent and identically distributed random variables. Furthermore, since $C$ is independent of $X$,
$$E\,U_i(\theta_q)=E\big[\phi(X_i-\theta_q)\big]=0.$$
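The inverse-probability-weighting identity $E\,U_i(\theta_q)=0$ can be checked by Monte Carlo. The sketch below takes $\phi(u)=I(u\leq 0)-q$, lifetimes $X\sim\mathrm{Exp}(1)$ (so the median is $\theta_{1/2}=\log 2$), censoring $C\sim\mathrm{Exp}$ with scale $4$, and treats $G$ as known; all of these are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, q = 200_000, 0.5
theta_q = np.log(2.0)                    # true median of Exp(1)

x = rng.exponential(scale=1.0, size=n)   # X ~ Exp(1)
c = rng.exponential(scale=4.0, size=n)   # C ~ Exp(scale 4), independent
y = np.minimum(x, c)
delta = (x <= c).astype(float)

phi = (y <= theta_q).astype(float) - q   # phi(Y_i - theta_q)
surv_g = np.exp(-y / 4.0)                # 1 - G(y) for this censoring law
u = phi * delta / surv_g                 # U_i(theta_q)
```

The sample mean of the $u$ values is close to zero, as the identity predicts.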
Thus, following the idea of Owen [2], an empirical likelihood ratio function can similarly be defined for $\theta_q$ as
$$R(\theta_q)=\sup\Big\{\prod_{i=1}^n np_i:\ p_i\geq 0,\ \sum_{i=1}^n p_i=1,\ \sum_{i=1}^n p_iU_i(\theta_q)=0\Big\}.$$
However, we cannot use $R(\theta_q)$ directly for inference on $\theta_q$, since the distribution function $G(\cdot)$ of the censoring variables is unknown. It is natural to replace $G(\cdot)$ by the Kaplan-Meier estimator
$$\hat G_n(y)=1-\prod_{i:\,Y_{(i)}\leq y}\Big(\frac{n-i}{n-i+1}\Big)^{1-\delta_{(i)}},$$
where $Y_{(1)}\leq Y_{(2)}\leq\cdots\leq Y_{(n)}$ are the order statistics of the $Y_i$'s and $\delta_{(i)}$ is the censoring indicator associated with $Y_{(i)}$. Let $\hat{U}_i(\theta_q)=\frac{\phi(Y_i-\theta_q)\delta_i}{1-\hat{G}_n(Y_i)}$; then an estimated empirical likelihood ratio function can be defined as
$$\hat R(\theta_q)=\sup\Big\{\prod_{i=1}^n np_i:\ p_i\geq 0,\ \sum_{i=1}^n p_i=1,\ \sum_{i=1}^n p_i\hat U_i(\theta_q)=0\Big\}. \eqno(2.4)$$
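A minimal sketch of the Kaplan-Meier estimator of the censoring survival function $1-\hat G_n(\cdot)$, in which the censoring indicators ($\delta_i=0$) play the role of events; the function name and the toy data are illustrative.

```python
import numpy as np

def km_censoring_survival(y, delta, t):
    """Kaplan-Meier estimate of 1 - G(t) for the censoring variable:
    a factor (n - i) / (n - i + 1) enters only at censoring times
    (delta = 0), mirroring the product over {i : Y_(i) <= t}."""
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=int)
    order = np.argsort(y, kind="stable")
    n = len(y)
    surv = 1.0
    for i, idx in enumerate(order, start=1):   # i is the rank of Y_(i)
        if y[idx] > t:
            break
        if delta[idx] == 0:                    # censoring event at Y_(i)
            surv *= (n - i) / (n - i + 1)
    return surv
```

For example, with $Y=(1,2,3,4)$ and $\delta=(1,0,1,0)$, only $Y_{(2)}=2$ contributes a factor $2/3$ for $t\in[2,3)$, so $1-\hat G_n(t)=2/3$ there.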
By the method of Lagrange multipliers applied to (2.4), one can show that the maximum is attained at
$$p_i=\frac{1}{n}\cdot\frac{1}{1+\lambda(\theta_q)\hat U_i(\theta_q)}, \eqno(2.5)$$
where $\lambda(\theta_q)$ is the solution to
$$\frac{1}{n}\sum_{i=1}^n\frac{\hat U_i(\theta_q)}{1+\lambda(\theta_q)\hat U_i(\theta_q)}=0. \eqno(2.6)$$
By (2.4) and (2.5), we obtain
$$-2\log\hat R(\theta_q)=2\sum_{i=1}^n\log\big(1+\lambda(\theta_q)\hat U_i(\theta_q)\big). \eqno(2.7)$$
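Equations (2.6) and (2.7) can be evaluated numerically. Below is a minimal sketch using Newton's method; the input values of $\hat U_i$ are hypothetical, and no safeguards for the feasibility constraint $1+\lambda\hat U_i>0$ are included.

```python
import numpy as np

def neg2_log_el_ratio(u, tol=1e-12, max_iter=100):
    """Solve sum_i u_i / (1 + lam * u_i) = 0 for lam by Newton's method
    (equation (2.6)), then return 2 * sum_i log(1 + lam * u_i), i.e.
    the statistic -2 log R-hat of equation (2.7)."""
    u = np.asarray(u, dtype=float)
    lam = 0.0
    for _ in range(max_iter):
        denom = 1.0 + lam * u
        g = np.sum(u / denom)              # left side of (2.6), times n
        dg = np.sum((u / denom) ** 2)      # equals -g'(lam)
        step = g / dg
        lam += step
        if abs(step) < tol:
            break
    return 2.0 * np.sum(np.log(1.0 + lam * u))
```

With a perfectly balanced sample such as $u=(-1,1)$, $\lambda=0$ solves (2.6) and the statistic is $0$; an off-center sample yields a positive value.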
Theorem 2.1 If $E(U_i^2(\theta_q))<\infty$ and $\theta_q$ is the true $q$-quantile of $F(\cdot)$, we have
$$-2\log\hat R(\theta_q)\rightarrow^L \chi^2_1,$$
where $\rightarrow^L$ denotes convergence in distribution and $\chi^2_1$ is a chi-square random variable with one degree of freedom.
Remark On the basis of Theorem 2.1, $-2\log \hat{R}(\theta_q)$ can be used to construct a confidence region for $\theta_q$,
$$\hat I_{\alpha}(\theta_q)=\big\{\theta:\ -2\log\hat R(\theta)\leq c_\alpha\big\},$$
where $c_\alpha$ satisfies $P(\chi^2_1\leq c_\alpha)=1-\alpha$. Then, by Theorem 2.1, $\hat{I}_{\alpha}(\theta_q)$ gives a confidence interval for $\theta_q$ with asymptotically correct coverage probability $1-\alpha$.
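A sketch of how such an interval can be computed in the simplest special case, an uncensored sample (so $\hat G_n\equiv 0$ and $\hat U_i(\theta)=I(Y_i\leq\theta)-q$) with $q=1/2$: scan a grid of candidate $\theta$ values and keep those with $-2\log\hat R(\theta)\leq c_{0.05}\approx 3.841$. The data, grid, and Newton solver are illustrative.

```python
import numpy as np

def neg2_log_el(u):
    # Newton solve of sum u_i/(1 + lam*u_i) = 0, then 2*sum log(1 + lam*u_i)
    lam = 0.0
    for _ in range(100):
        d = 1.0 + lam * u
        g = np.sum(u / d)
        lam += g / np.sum((u / d) ** 2)
        if abs(g) < 1e-10:
            break
    return 2.0 * np.sum(np.log(1.0 + lam * u))

rng = np.random.default_rng(3)
y = rng.normal(size=100)                 # uncensored sample (illustrative)
q, c_alpha = 0.5, 3.841                  # chi^2_1 0.95 quantile

grid = np.linspace(-1.0, 1.0, 201)
inside = [t for t in grid
          if neg2_log_el((y <= t).astype(float) - q) <= c_alpha]
ci = (min(inside), max(inside))          # approximate 95% interval
```

The retained grid points form an interval around the sample median, which is the point where the statistic equals zero.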
Throughout this section, we use $c>0$ to denote a generic constant that may take different values at each appearance.
Lemma 3.1 Under the assumptions of Theorem 2.1, if $\theta_q$ is the true $q$-quantile of $F(\cdot)$, we have
$$\frac{1}{\sqrt n}\sum_{i=1}^n\hat U_i(\theta_q)\rightarrow^L N(0, v_q^2),$$
where $v^2_q=E(U_i^2(\theta_q))$.
Proof By the definition of $\hat{U}_i(\theta_q)$, it is easy to show that
Since $\{U_i(\theta_q)\}_{i=1}^n$ are independent and identically distributed random variables with $E(U_i^2(\theta_q))<\infty$, the strong law of large numbers implies $\frac{1}{n}\sum\limits_{i=1}^n|U_i(\theta_q)|=O(1)$ a.s. Zhou [18] proved
where $Y_{(n)}=\max\limits_{1\leq i\leq n}Y_i$. So we have
By the central limit theorem for independent and identically distributed random variables,
$$\frac{1}{\sqrt n}\sum_{i=1}^n U_i(\theta_q)\rightarrow^L N(0, v_q^2).$$
This completes the proof.
Lemma 3.2 Under the assumptions of Theorem 2.1, if $\theta_q$ is the true $q$-quantile of $F(\cdot)$, we have
$$\frac{1}{n}\sum_{i=1}^n\hat U_i^2(\theta_q)=v_q^2+o_p(1).$$
Next, from the law of large numbers and (3.2), we get $n^{-1}\sum\limits_{i=1}^nU_i^2(\theta_q)\rightarrow E(U_i^2(\theta_q))$ a.s.; then we have
By arguments similar to those for $I_2=o_p(1)$, we can also obtain $I_3=o_p(1)$. Thus, the law of large numbers implies that
Lemma 3.3 Under the assumptions of Theorem 2.1, if $\theta_q$ is the true $q$-quantile of $F(\cdot)$, we have
$$\lambda(\theta_q)=O_p(n^{-1/2}).$$
Proof It is well known that for any sequence of independent and identically distributed random variables $\{\xi_i\}_{i=1}^n$ with $E(\xi_i^2)<\infty$, we have
$$\max_{1\leq i\leq n}|\xi_i|=o(n^{1/2}) \quad \text{a.s.}$$
This implies that $\max\limits_{1\leq i\leq n}|U_i(\theta_q)|=o_p(n^{1/2})$. From (3.2), we have
$$\max_{1\leq i\leq n}|\hat U_i(\theta_q)|=o_p(n^{1/2}).$$
Next, we prove $\lambda(\theta_q)=O_p(n^{-1/2})$. Write $\lambda(\theta_q)=\alpha|\lambda(\theta_q)|$, where $\alpha=1$ or $-1$, and set $\bar{\Lambda}=\frac{1}{n}\sum\limits_{i=1}^n\hat{U}_i(\theta_q)$, $\Lambda^{*}=\max\limits_{1\leq i\leq n}|\hat{U}_i(\theta_q)|$, $S=\frac{1}{n}\sum\limits_{i=1}^n\hat{U}_i^2(\theta_q)$. From (2.6), we have
where we have used $0<1+\lambda(\theta_q)\hat{U}_i(\theta_q)\leq 1+|\lambda(\theta_q)|\Lambda^{*}$, which follows from
Therefore, $|\lambda(\theta_q)|(S-\alpha\bar{\Lambda}\Lambda^{*})\leq|\alpha\bar{\Lambda}|$. Since $\Lambda^{*}=o_p(n^{1/2})$ and, by Lemma 3.1, $\bar{\Lambda}=O_p(n^{-1/2})$, we have $|\lambda(\theta_q)|(S+o_p(1))\leq|\alpha\bar{\Lambda}|$. Lemma 3.2 implies that $S=v_q^2+o_p(1)$; hence $|\lambda(\theta_q)|=O_p(n^{-1/2})$. This completes the proof.
Proof of Theorem 2.1 Applying a Taylor expansion to equations (2.6) and (2.7), we obtain
where
From (2.6), we can get
From Lemma 3.2 and Lemma 3.3, by simple calculation, we have
It follows that
Furthermore, from (3.6), we obtain
By (3.5), (3.6) and (3.7), we have