数学杂志 (Journal of Mathematics), 2018, Vol. 38, Issue (5): 793-803
BAYES PREDICTION OF POPULATION QUANTITIES IN A FINITE POPULATION
HU Gui-kai, XIONG Peng-fei, WANG Tong-xin    
School of Science, East China University of Technology, Nanchang 330013, China
Abstract: In this paper, we investigate prediction in a finite population with the normal inverse-Gamma prior under squared error loss. First, we obtain the Bayes predictions of linear quantities and quadratic quantities based on Bayesian theory. Second, we compare the Bayes prediction of linear quantities with the best linear unbiased prediction by means of statistical decision theory, and show that the Bayes prediction is uniformly better under the prediction mean squared error.
Key words: Bayes prediction; linear quantities; quadratic quantities; finite populations
1 Introduction

Let $\mathscr{P}=\{1, \cdots, N\}$ denote a finite population of $N$ identifiable units, where $N$ is known. Associated with the $i$th unit of $\mathscr{P}$, there are $p+1$ quantities: $y_{i}, x_{i1}, \cdots, x_{ip}$, where all but $y_{i}$ are known, $i=1, \cdots , N$. Let $y=(y_{1}, \cdots, y_{N})'$ and $X=(X_{1}, \cdots, X_{N})'$, where $X_{i}=(x_{i1}, \cdots, x_{ip})'$, $i=1, \cdots , N$. Relating the two sets of variables, we consider the linear model

$ \begin{equation} y=X\beta+\varepsilon, \varepsilon\sim N_{N}(0, \sigma^{2}V), \end{equation} $ (1.1)

where $\beta$ is a $p\times 1$ unknown parameter vector, $V$ is a known symmetric positive definite matrix, but the parameter $\sigma^{2}>0$ is unknown.

For the superpopulation model (1.1), it is of interest to study the optimal prediction of a population quantity $\theta(y)$, such as the population total $T=\sum\limits_{i=1}^{N}y_{i}$, the population variance $S_{y}^{2}=\sum\limits_{i=1}^{N}(y_{i}-\bar{y}_{N})^{2}/N$, where $\bar{y}_{N}=T/N$ is the population mean, and the finite population regression coefficient $\beta_{N}=(X'V^{-1}X)^{-1}X'V^{-1}y$. In the literature, many predictors of population quantities have been proposed. For example, Bolfarine and Rodrigues [1] gave the simple projection predictor and obtained necessary and sufficient conditions for it to be optimal. Bolfarine et al. [2] studied the best unbiased prediction of the finite population regression coefficient under the generalized prediction mean squared error in different kinds of models. Xu et al. [3] obtained a kind of optimal prediction of linear predictable functions and derived necessary and sufficient conditions for a linear prediction to be optimal under matrix loss. Xu and Yu [4] further gave admissible predictions in superpopulation models with random regression coefficients under a matrix loss function. Hu and Peng [5] obtained conditions for a linear prediction to be admissible in superpopulation models with and without the assumption that the underlying distribution is normal, respectively. Furthermore, Hu et al. [6-7] discussed linear minimax prediction in multivariate normal populations and Gauss-Markov populations, respectively; their results showed that the linear minimax prediction of the finite population regression coefficient is admissible under some conditions. Bolfarine and Zacks [8] studied Bayes and minimax prediction under the squared error loss function in a finite population with a single-parameter prior. Meanwhile, Bansal and Aggarwal [9-11] considered Bayes prediction of the finite population regression coefficient using a balanced loss function under the same prior information. The studies above share two characteristics.

On the one hand, the optimal, linear admissible and minimax predictions were obtained from statistical decision theory. It is well known that statistical decision theory considers only the sample information and the loss function and does not take prior information into account. In practice, however, such prior information is often available.

On the other hand, Bayes prediction was discussed using prior information on a single parameter only; the multi-parameter situation was not considered. In other words, only the prior information on the regression coefficient was used, while the prior information on the error variance in model (1.1) was ignored. In fact, multi-parameter situations are often encountered in practical problems. Therefore, in this paper we study Bayes prediction of linear and quadratic quantities in a finite population in which the regression coefficient and the error variance have a normal inverse-Gamma prior.

Assume that the prior distribution of $\beta$ and $\sigma^{2}$ is normal inverse-Gamma distribution, that is,

$ \begin{equation} \beta\mid\sigma^{2}\sim N_{p}(\mu, \frac{\sigma^{2}}{k}I_{p}), \sigma^{2}\sim \Gamma^{-1}(\frac{\alpha}{2}, \frac{\lambda}{2}), \end{equation} $ (1.2)

where $\mu$ is a known $p\times 1$ vector, $\alpha$ and $\lambda$ are known constants, and $k^{-1}$ is the ratio of the prior variance of $\beta$ to the error variance of model (1.1). We may assume that $k^{-1}$ is known from experience or professional knowledge. Therefore, the joint prior density of $(\beta, \sigma^{2})$ is

$ \begin{eqnarray} \pi(\beta, \sigma^{2})&=& p_{1}(\beta|\sigma^{2})p_{2}(\sigma^{2})\nonumber\\ &=& M_{1}(\sigma^{2})^{-(\frac{p+\alpha}{2}+1)}\exp\{-\frac{1}{2\sigma^{2}}[k(\beta-\mu)'(\beta-\mu)+\lambda]\}, \end{eqnarray} $ (1.3)

where $M_{1}=(\frac{k}{2\pi})^{\frac{p}{2}}(\frac{\lambda}{2})^{\frac{\alpha}{2}}[\Gamma(\frac{\alpha}{2})]^{-1}$. The Bayes model defined by (1.1) and (1.2) is designated as model (1.4). In order to obtain Bayes prediction in the Bayes model (1.4), a sample $\mathscr{S}$ of size $n$ is selected from $\mathscr{P}$ according to some specified sampling plan. Let $\mathscr{R}=\mathscr{P}-\mathscr{S}$ be the unobserved part of $\mathscr{P}$, of size $N-n$. After the sample $\mathscr{S}$ has been selected, we may reorder the elements of $y$ so that $y$, $X$ and $V$ are partitioned correspondingly, that is,

$ \begin{equation*} y=\left(\begin{array}{c}y_{s}\\ y_{r}\end{array}\right), X=\left(\begin{array}{c}X_{s}\\ X_{r}\end{array}\right), V=\left(\begin{array}{cc}V_{s}&V_{sr}\\ V_{rs}&V_{r}\end{array}\right), \end{equation*} $

where $X$ and $X_{s}$ are known column full rank matrices.

The rest of this paper is organized as follows. In Section 2, we give the posterior distributions and the Bayes predictive distribution for the Bayes model (1.4). Section 3 is devoted to Bayes prediction of linear quantities and its comparison with the best linear unbiased prediction. In Section 4, we obtain Bayes prediction of quadratic quantities. Illustrative examples are included in Sections 3 and 4, and concluding remarks are given in Section 5.

2 Bayes Prediction of Population Quantities

In this section, we discuss the Bayes prediction of population quantities. Let $L(\hat{\theta}(y_{s}), \theta(y))$ be a loss function for predicting $\theta(y)$ by $\hat{\theta}(y_{s})$. The corresponding Bayes prediction risk of $\hat{\theta}(y_{s})$ in model (1.4) is defined as $\varrho(\hat{\theta}(y_{s}), \theta(y))=E_{y}[L(\hat{\theta}(y_{s}), \theta(y))]$, where the expectation operator $E_{y}$ is taken with respect to the joint distribution of $y$ and $(\beta, \sigma^{2})$. The Bayes predictor is the one minimizing the Bayes prediction risk $\varrho(\hat{\theta}(y_{s}), \theta(y))$. In particular, under the squared error loss, the Bayes prediction of $\theta(y)$ is

$ \begin{equation} \hat{\theta}(y_{s})=E_{y}[\theta(y)|y_{s}], \end{equation} $ (2.1)

and the Bayes prediction risk is

$ \begin{equation} \varrho(\hat{\theta}(y_{s}), \theta(y))=E_{y_{s}}\{{\rm Var}[\theta(y)|y_{s}]\}, \end{equation} $ (2.2)

where the expectation operator $E_{y_{s}}$ is performed with respect to the joint distribution of $y_{s}$ and $(\beta, \sigma^{2})$. It is noted that $y_{s}|\beta, \sigma^{2}\sim N_{n}(X_{s}\beta, \sigma^{2}V_{s})$ and

$ y_{r}-V_{rs}V_{s}^{-1}y_{s}|\beta, \sigma^{2}\sim N_{N-n}((X_{r}-V_{rs}V_{s}^{-1}X_{s})\beta, \sigma^{2}(V_{r}-V_{rs}V_{s}^{-1}V_{sr})). $

This together with eq. (1.3) will yield the following results.

Theorem 2.1 Under the Bayes model (1.4), the following results hold.

(ⅰ) The joint posterior probability density of $(\beta, \sigma^{2})$ is

$ \begin{equation} \pi(\beta, \sigma^{2}|y_{s})=M_{2}|\Sigma|^{-\frac{1}{2}}(\sigma^{2})^{-(\frac{n+p+\alpha}{2}+1)} \exp\{-\frac{1}{2\sigma^{2}}[c_{0}+(\beta-\tilde{\beta}_{s})'\Sigma^{-1}(\beta-\tilde{\beta}_{s})]\}. \end{equation} $ (2.3)

(ⅱ) The marginal posterior distribution of $\beta$ is the $p$-dimensional $t$ distribution $MT_{p}(\tilde{\beta}_{s}, \frac{c_{0}\Sigma}{n+\alpha}, n+\alpha)$ with probability density

$ \pi(\beta|y_{s})=M_{3}|\frac{c_{0}\Sigma}{n+\alpha}|^{-\frac{1}{2}} [1+\frac{1}{n+\alpha}(\beta-\tilde{\beta}_{s})'(\frac{c_{0}\Sigma}{n+\alpha})^{-1}(\beta-\tilde{\beta}_{s})]^{-\frac{n+\alpha+p}{2}}. $

(ⅲ) The marginal posterior distribution of $\sigma^{2}$ is $\Gamma^{-1}(\frac{n+\alpha}{2}, \frac{c_{0}}{2})$ with probability density

$ \pi(\sigma^{2}|y_{s})=M_{4}(\sigma^{2})^{-(\frac{n+\alpha}{2}+1)}\exp(-\frac{c_{0}}{2\sigma^{2}}). $

(ⅳ) The Bayes predictive distribution of $y_{r}$ given $y_{s}$ is the $(N-n)$-dimensional $t$ distribution $MT_{N-n}(\tilde{y}_{r}, \frac{c_{0}U}{n+\alpha}, n+\alpha)$ with probability density

$ \pi(y_{r}|y_{s})=M_{5}|\frac{c_{0}U}{n+\alpha}|^{-\frac{1}{2}}[1+\frac{1}{n+\alpha}(y_{r}-\tilde{y}_{r})'(\frac{c_{0}U}{n+\alpha})^{-1}(y_{r}-\tilde{y}_{r})]^{-\frac{N+\alpha}{2}}, $

where

$ \begin{array}{l} \hat{\beta}_{s}=(X_{s}'V_{s}^{-1}X_{s})^{-1}X_{s}'V_{s}^{-1}y_{s}, ~~ \hat{\sigma}^{2}=\frac{(y_{s}-X_{s}\hat{\beta}_{s})'V_{s}^{-1}(y_{s}-X_{s}\hat{\beta}_{s})}{n-p}, \\ c_{0}=(\hat{\beta}_{s}-\mu)'A(\hat{\beta}_{s}-\mu)+(n-p)\hat{\sigma}^{2}+\lambda, ~~ \tilde{\beta}_{s}=\Sigma[k\mu+(X_{s}'V_{s}^{-1}X_{s})\hat{\beta}_{s}], \\ A=[(X_{s}'V_{s}^{-1}X_{s})^{-1}+k^{-1}I_{p}]^{-1}, ~~ \Sigma=(X_{s}'V_{s}^{-1}X_{s}+kI_{p})^{-1}, \\ M_{2}=(2\pi)^{-\frac{p}{2}}M_{4}, ~~ M_{3}=\frac{\Gamma(\frac{ n+\alpha+p}{2})}{\Gamma(\frac{n+\alpha}{2})}[\pi(n+\alpha)]^{-\frac{p}{2}}, \\ M_{4}=\frac{(\frac{c_{0}}{2})^{\frac{n+\alpha}{2}}}{\Gamma(\frac{n+\alpha}{2})}, ~~ M_{5}=\frac{\Gamma(\frac{ N+\alpha}{2})}{\Gamma(\frac{n+\alpha}{2})}[\pi(n+\alpha)]^{-\frac{N-n}{2}}, \\ \tilde{y}_{r}=X_{r}\tilde{\beta}_{s}+V_{rs}V_{s}^{-1}(y_{s}-X_{s}\tilde{\beta}_{s}), \\ U=V_{r}-V_{rs}V_{s}^{-1}V_{sr}+(X_{r}-V_{rs}V_{s}^{-1}X_{s})\Sigma(X_{r}-V_{rs}V_{s}^{-1}X_{s})'. \end{array} $

Proof The proof of (ⅰ): since

$ \begin{array}{l} (y_{s}-X_{s}\beta)'V_{s}^{-1}(y_{s}-X_{s}\beta)\\ = (y_{s}-X_{s}\hat{\beta}_{s})'V_{s}^{-1}(y_{s}-X_{s}\hat{\beta}_{s}) +(X_{s}\hat{\beta}_{s}-X_{s}\beta)'V_{s}^{-1}(X_{s}\hat{\beta}_{s}-X_{s}\beta)\\ = (n-p)\hat{\sigma}^{2}+(\hat{\beta}_{s}-\beta)'X_{s}'V_{s}^{-1}X_{s}(\hat{\beta}_{s}-\beta) \end{array} $

and $y_{s}|\beta, \sigma^{2}\sim N_{n}(X_{s}\beta, \sigma^{2}V_{s})$, the conditional probability density of $y_{s}$ given $(\beta, \sigma^{2})$ is

$ \begin{equation*} p_{3}(y_{s}|\beta, \sigma^{2})=(2\pi\sigma^{2})^{-\frac{n}{2}}|V_{s}|^{-\frac{1}{2}}\exp\{-\frac{(n-p)\hat{\sigma}^{2}+(\hat{\beta}_{s}-\beta)'X_{s}'V_{s}^{-1}X_{s}(\hat{\beta}_{s}-\beta)}{2\sigma^{2}}\}. \end{equation*} $

This together with eq. (1.3) will yield that the joint posterior probability density of $(\beta, \sigma^{2})$ is

$ \begin{array}{l} \pi(\beta, \sigma^{2}|y_{s}) = \frac{p_{3}(y_{s}|\beta, \sigma^{2})\pi(\beta, \sigma^{2})}{m(y_{s})}\\ \propto p_{3}(y_{s}|\beta, \sigma^{2})\pi(\beta, \sigma^{2})\\ \propto (\sigma^{2})^{-(\frac{n+\alpha+p}{2}+1)}\exp\{-\frac{1}{2\sigma^{2}}[(n-p)\hat{\sigma}^{2}+(\beta-\hat{\beta}_{s})'X_{s}'V_{s}^{-1}X_{s}(\beta-\hat{\beta}_{s})]\}\\ \times\exp\{-\frac{1}{2\sigma^{2}}[k(\beta-\mu)'(\beta-\mu)+\lambda]\}\\ \propto (\sigma^{2})^{-(\frac{n+\alpha+p}{2}+1)}\exp\{-\frac{1}{2\sigma^{2}}[c_{0}+(\beta-\tilde{\beta}_{s})'\Sigma^{-1}(\beta-\tilde{\beta}_{s})]\}, \end{array} $

where $m(y_{s})$ is the marginal probability density of $y_{s}$ and the symbol $\propto$ denotes proportionality. Multiplying by the normalizing constant $M_{2}|\Sigma|^{-\frac{1}{2}}$ gives eq. (2.3), which proves result (ⅰ).

The proof of (ⅱ): integrating eq. (2.3) with respect to $\sigma^{2}$, we have

$ \begin{eqnarray*} \pi(\beta|y_{s})&=&\int_{0}^{+\infty}\pi(\beta, \sigma^{2}|y_{s})d\sigma^{2}\\ &=&M_{2}|\Sigma|^{-\frac{1}{2}}\int_{0}^{+\infty}(\sigma^{2})^{-(\frac{n+p+\alpha}{2}+1)}\exp\{-\frac{1}{2\sigma^{2}} [c_{0}+(\beta-\tilde{\beta}_{s})'\Sigma^{-1}(\beta-\tilde{\beta}_{s})]\}d\sigma^{2}\\ &=&M_{3}|\frac{c_{0}\Sigma}{n+\alpha}|^{-\frac{1}{2}} [1+\frac{1}{n+\alpha}(\beta-\tilde{\beta}_{s})'(\frac{c_{0}\Sigma}{n+\alpha})^{-1}(\beta-\tilde{\beta}_{s})]^{-\frac{n+\alpha+p}{2}}, \end{eqnarray*} $

which implies that the marginal posterior distribution of $\beta$ is the $p$-dimensional $t$ distribution with location vector $\tilde{\beta}_{s}$, scale matrix $\frac{c_{0}\Sigma}{n+\alpha}$ and $n+\alpha$ degrees of freedom.

The proof of (ⅲ): integrating eq. (2.3) with respect to $\beta$ yields the result; the details are omitted.

The proof of (ⅳ): by $y_{s}|\beta, \sigma^{2}\sim N_{n}(X_{s}\beta, \sigma^{2}V_{s})$, $y_{r}|\beta, \sigma^{2}, y_{s}\sim N_{N-n}(X_{r}\beta+V_{rs}V_{s}^{-1}(y_{s}-X_{s}\beta), \sigma^{2}(V_{r}-V_{rs}V_{s}^{-1}V_{sr}))$, together with eq. (2.3), we know that

$ \begin{array}{l} \pi(y_{r}, \beta, \sigma^{2}| y_{s}) \propto p_{3}(y_{s}|\beta, \sigma^{2})\pi(\beta, \sigma^{2})p_{4}(y_{r}|\beta, \sigma^{2}, y_{s})\\ \propto (\sigma^{2})^{-(\frac{N+p+\alpha}{2}+1)}\exp\{-\frac{1}{2\sigma^{2}}[c_{0}+(\beta-\tilde{\beta}_{s})'\Sigma^{-1}(\beta-\tilde{\beta}_{s})]\}\\ \times \exp\{-\frac{1}{2\sigma^{2}}[y_{r}-X_{r}\beta-V_{rs}V_{s}^{-1}(y_{s}-X_{s}\beta)]'(V_{r}-V_{rs}V_{s}^{-1}V_{sr})^{-1}\\ ~~~~[y_{r}-X_{r}\beta-V_{rs}V_{s}^{-1}(y_{s}-X_{s}\beta)]\}\\ \propto (\sigma^{2})^{-(\frac{N+p+\alpha}{2}+1)}\exp\{-\frac{1}{2\sigma^{2}}[(\beta-D^{-1}\beta^{*})'D(\beta-D^{-1}\beta^{*})+c_{0}+(y_{r}-\tilde{y}_{r})'U^{-1}(y_{r}-\tilde{y}_{r})]\}, \end{array} $

where

$ \begin{array}{l} \beta^{*}=\Sigma^{-1}\tilde{\beta}_{s}+(X_{r}-V_{rs}V_{s}^{-1}X_{s})'(V_{r}-V_{rs}V_{s}^{-1}V_{sr})^{-1}(y_{r}-V_{rs}V_{s}^{-1}y_{s}), \\ D=\Sigma^{-1}+(X_{r}-V_{rs}V_{s}^{-1}X_{s})'(V_{r}-V_{rs}V_{s}^{-1}V_{sr})^{-1}(X_{r}-V_{rs}V_{s}^{-1}X_{s}).\end{array} $

Including the normalizing constant and integrating the above expression with respect to $\beta$ and $\sigma^{2}$, respectively, we obtain the result.
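For readers who wish to evaluate the quantities of Theorem 2.1 numerically, the following NumPy sketch computes $\hat{\beta}_{s}$, $\hat{\sigma}^{2}$, $\Sigma$, $\tilde{\beta}_{s}$, $c_{0}$, $\tilde{y}_{r}$, $U$ and the predictive covariance ${\rm Var}(y_{r}|y_{s})=\frac{c_{0}}{n+\alpha-2}U$ from a given sample. It is only an illustration of the formulas above; the function and variable names are ours and are not part of the paper.

```python
import numpy as np

def posterior_summaries(X_s, V_s, X_r, V_rs, V_r, y_s, mu, k, alpha, lam):
    """Evaluate the quantities appearing in Theorem 2.1 (illustrative sketch)."""
    n, p = X_s.shape
    Vs_inv = np.linalg.inv(V_s)
    XtVX = X_s.T @ Vs_inv @ X_s                              # X_s' V_s^{-1} X_s
    beta_hat = np.linalg.solve(XtVX, X_s.T @ Vs_inv @ y_s)   # GLS estimator of beta
    resid = y_s - X_s @ beta_hat
    sigma2_hat = resid @ Vs_inv @ resid / (n - p)
    Sigma = np.linalg.inv(XtVX + k * np.eye(p))              # (X_s'V_s^{-1}X_s + k I_p)^{-1}
    beta_tilde = Sigma @ (k * mu + XtVX @ beta_hat)          # posterior mean of beta
    A = np.linalg.inv(np.linalg.inv(XtVX) + np.eye(p) / k)
    c0 = (beta_hat - mu) @ A @ (beta_hat - mu) + (n - p) * sigma2_hat + lam
    W = X_r - V_rs @ Vs_inv @ X_s                            # X_r - V_rs V_s^{-1} X_s
    y_r_tilde = X_r @ beta_tilde + V_rs @ Vs_inv @ (y_s - X_s @ beta_tilde)
    U = V_r - V_rs @ Vs_inv @ V_rs.T + W @ Sigma @ W.T
    var_y_r = c0 / (n + alpha - 2) * U                       # predictive covariance of y_r given y_s
    return beta_hat, sigma2_hat, beta_tilde, c0, y_r_tilde, U, var_y_r
```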

3 Bayes Prediction of Linear Quantities

In order to obtain Bayes prediction of $\theta(y)$, we consider the squared error loss

$ \begin{equation} L(\hat{\theta}(y_{s}), \theta(y))=[\hat{\theta}(y_{s})-\theta(y)]^{2}, \end{equation} $ (3.1)

then the Bayes prediction of $\theta(y)$ is

$ \begin{equation} \hat{\theta}(y_{s})=E_{y}[\theta(y)|y_{s}], \end{equation} $ (3.2)

and Bayes prediction risk is

$ \begin{equation} \varrho(\hat{\theta}(y_{s}), \theta(y))=E_{y}[\hat{\theta}(y_{s})-\theta(y)]^{2}=E_{y_{s}}\{{\rm Var}[\theta(y)|y_{s}]\}, \end{equation} $ (3.3)

where the expectation operator $E_{y_{s}}$ is performed with respect to the joint distribution of $y_{s}$ and $(\beta, \sigma^{2})$. By result (ⅳ) of Theorem 2.1, we know

$ \begin{equation} E_{y}(y_{r}|y_{s})=X_{r}\tilde{\beta}_{s}+V_{rs}V_{s}^{-1}(y_{s}-X_{s}\tilde{\beta}_{s}) \end{equation} $ (3.4)

and

$ \begin{equation} {\rm Var}(y_{r}|y_{s})=\frac{c_{0}}{n+\alpha-2}U. \end{equation} $ (3.5)

Now, let $\theta(y)=Qy$ be any linear quantity, where $Q=(Q_{s}', Q_{r}')$ is a known $1\times N$ vector. According to Theorem 2.1 and eqs. (3.4)-(3.5), we have the following conclusions.

Theorem 3.1 Under model (1.4) and the squared error loss function, the Bayes predictor of the linear quantity $Qy$ is $\tilde{\theta}(y_{s})=Q_{s}'y_{s}+Q_{r}'\tilde{y}_{r}$, and its Bayes prediction risk is $\frac{E_{y_{s}}(c_{0})}{n+\alpha-2}Q_{r}'UQ_{r}$.

As is well known, the best linear unbiased prediction of $Qy$ under the squared error loss is $\hat{\theta}(y_{s})=Q_{s}'y_{s}+Q_{r}'\hat{y}_{r}$, where $\hat{y}_{r}=X_{r}\hat{\beta}_{s}+V_{rs}V_{s}^{-1}(y_{s}-X_{s}\hat{\beta}_{s})$. In the following, we compare the Bayes prediction with the best linear unbiased prediction under the predictive mean squared error (PMSE), which is defined by ${\rm PMSE}(d(y_{s}), Qy)=E[(d(y_{s})-Qy)^{2}].$

Theorem 3.2 Under model (1.4), the Bayes prediction $\tilde{\theta}(y_{s})$ of $Qy$ is better than the best linear unbiased prediction $\hat{\theta}(y_{s})$ under the predictive mean squared error.

Proof By the definition of PMSE and $\tilde{\beta}_{s}=\hat{\beta}_{s}-k\Sigma(\hat{\beta}_{s}-\mu)$, we have

$ \begin{eqnarray*} &&{\rm PMSE}(\tilde{\theta}(y_{s}), Qy)\\ &=&E[(\tilde{\theta}(y_{s})-Qy)^{2}]\\ &=&E[Q_{r}'(\tilde{y}_{r}-y_{r})(\tilde{y}_{r}-y_{r})'Q_{r}]\\ &=&E[Q_{r}'(X_{r}\tilde{\beta}_{s}+V_{rs}V_{s}^{-1}(y_{s}-X_{s}\tilde{\beta}_{s})-y_{r})(X_{r}\tilde{\beta}_{s}+V_{rs}V_{s}^{-1}(y_{s}-X_{s}\tilde{\beta}_{s})-y_{r})'Q_{r}]\\ &=&Q_{r}'E_{y_{s}}[(X_{r}-V_{rs}V_{s}^{-1}X_{s})(\tilde{\beta}_{s}-\beta)(\tilde{\beta}_{s}-\beta)'(X_{r}-V_{rs}V_{s}^{-1}X_{s})'+\sigma^{2}(V_{r}-V_{rs}V_{s}^{-1}V_{sr})]Q_{r}\\ &=&Q_{r}'(X_{r}-V_{rs}V_{s}^{-1}X_{s})E_{y_{s}}[(\hat{\beta}_{s}-\beta)-k\Sigma(\hat{\beta}_{s}-\mu)][(\hat{\beta}_{s}-\beta) -k\Sigma(\hat{\beta}_{s}-\mu)]'\\ &&(X_{r}-V_{rs}V_{s}^{-1}X_{s})'Q_{r}+\frac{\lambda}{\alpha-2}Q_{r}'(V_{r}-V_{rs}V_{s}^{-1}V_{sr})Q_{r}\\ &=&{\rm PMSE}(\hat{\theta}(y_{s}), Qy)\\ &&-\frac{k^{2}\lambda}{\alpha-2}Q_{r}'(X_{r}-V_{rs}V_{s}^{-1}X_{s})\Sigma(k^{-1}I_{p}+(X_{s}'V_{s}^{-1}X_{s})^{-1})\Sigma'(X_{r}-V_{rs}V_{s}^{-1}X_{s})'Q_{r}. \end{eqnarray*} $

That is, ${\rm PMSE}(\hat{\theta}(y_{s}), Qy)-{\rm PMSE}(\tilde{\theta}(y_{s}), Qy)> 0$. Therefore, $\tilde{\theta}(y_{s})$ is better than $\hat{\theta}(y_{s})$ under the predictive mean squared error.

Corollary 3.1 The Bayes predictor of the population total $T$ under model (1.4) and the loss function (3.1) is $\tilde{T}(y_{s})=1_{n}'y_{s}+1_{N-n}'[X_{r}\tilde{\beta}_{s}+V_{rs}V_{s}^{-1}(y_{s}-X_{s}\tilde{\beta}_{s})]$, and the Bayes risk of this predictor is $\frac{E_{y_{s}}(c_{0})}{n+\alpha-2}1_{N-n}'U1_{N-n}$. Moreover, $\hat{T}(y_{s})$ is dominated by $\tilde{T}(y_{s})$ under the predictive mean squared error, where $\hat{T}(y_{s})=1_{n}'y_{s}+1_{N-n}'\hat{y}_{r}$.

For the finite population regression coefficient $\beta_{N}=(X'V^{-1}X)^{-1}X'V^{-1}y$, following Bolfarine et al. [2], we can write it as

$ \begin{eqnarray*} \beta_{N}&=&(X'V^{-1}X)^{-1}X'V^{-1}y\\ &=&\bigg[(X_{s}'~~ X_{r}')\left(\begin{array}{cc}V_{s}&V_{sr}\\ V_{rs}&V_{r}\end{array}\right)^{-1}\left(\begin{array}{c}X_{s}\\ X_{r}\end{array}\right)\bigg]^{-1}(X_{s}'~~ X_{r}')\left(\begin{array}{cc}V_{s}&V_{sr}\nonumber\\ V_{rs}&V_{r}\end{array}\right)^{-1}\left(\begin{array}{c}y_{s}\\ y_{r}\end{array}\right)\\ &=&K_{s}y_{s}+K_{r}y_{r}, \end{eqnarray*} $

where

$ \begin{eqnarray*} &&K_{s}=G^{-1}JC^{-1}, ~~ K_{r}=G^{-1}FE^{-1}, ~~J=X_{s}'-X_{r}'V_{r}^{-1}V_{rs}, \\ && C=V_{s}-V_{sr}V_{r}^{-1}V_{rs}, ~~ F=X_{r}'-X_{s}'V_{s}^{-1}V_{sr}, ~~ E=V_{r}-V_{rs}V_{s}^{-1}V_{sr}, \end{eqnarray*} $

and

$ G=JC^{-1}X_{s}+FE^{-1}X_{r}. $

Then by Theorem 3.1, we have the following corollary.

Corollary 3.2 The Bayes predictor of the finite population regression coefficient $\beta_{N}$ under model (1.4) and the loss function (3.1) is $\tilde{\beta}_{N}(y_{s})=K_{s}y_{s}+K_{r}E(y_{r}|y_{s})$, and the Bayes risk of this predictor is $\frac{E_{y_{s}}(c_{0})}{n+\alpha-2}K_{r}UK_{r}'$. Moreover, it is better than $\hat{\beta}_{N}(y_{s})$ under the predictive mean squared error, where $\hat{\beta}_{N}(y_{s})=K_{s}y_{s}+K_{r}\hat{y}_{r}$.
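The decomposition $\beta_{N}=K_{s}y_{s}+K_{r}y_{r}$ is easy to check numerically; the following sketch (our own, with arbitrarily generated data) builds $K_{s}$ and $K_{r}$ as defined above and compares $K_{s}y_{s}+K_{r}y_{r}$ with the direct formula $(X'V^{-1}X)^{-1}X'V^{-1}y$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, p = 10, 6, 3
X = rng.standard_normal((N, p))
M = rng.standard_normal((N, N))
V = M @ M.T + N * np.eye(N)                      # an arbitrary symmetric positive definite V
y = rng.standard_normal(N)

X_s, X_r = X[:n], X[n:]
V_s, V_sr = V[:n, :n], V[:n, n:]
V_rs, V_r = V[n:, :n], V[n:, n:]
y_s, y_r = y[:n], y[n:]

J = X_s.T - X_r.T @ np.linalg.inv(V_r) @ V_rs
C = V_s - V_sr @ np.linalg.inv(V_r) @ V_rs
F = X_r.T - X_s.T @ np.linalg.inv(V_s) @ V_sr
E = V_r - V_rs @ np.linalg.inv(V_s) @ V_sr
G = J @ np.linalg.inv(C) @ X_s + F @ np.linalg.inv(E) @ X_r
K_s = np.linalg.inv(G) @ J @ np.linalg.inv(C)
K_r = np.linalg.inv(G) @ F @ np.linalg.inv(E)

beta_N_direct = np.linalg.solve(X.T @ np.linalg.inv(V) @ X, X.T @ np.linalg.inv(V) @ y)
beta_N_split = K_s @ y_s + K_r @ y_r
print(np.allclose(beta_N_direct, beta_N_split))  # expected: True
```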

In order to illustrate our results, we give the following example.

Example 3.1 Let $X=(x_{1}, x_{2}, \cdots, x_{N})'$, $V={\rm diag}(x_{1}, x_{2}, \cdots, x_{N})$ in the Bayesian model (1.4), where $x_{i}\neq 0, i=1, 2, \cdots, N$. If $X_{s}=(x_{1}, x_{2}, \cdots, x_{n})', y_{s}=(y_{1}, y_{2}, \cdots, y_{n})'$, we have $\tilde{\beta}_{s}=\frac{1}{\sum\limits_{i=1}^{n}x_{i}+k}(k\mu+1_{n}'y_{s})$, $\hat{\beta}_{s}=\frac{1}{\sum\limits_{i=1}^{n}x_{i}}1_{n}'y_{s}$. According to Theorem 3.1, we have the following conclusions.

(ⅰ) $\tilde{T}(y_{s})=1_{n}'y_{s}+\frac{\sum\limits_{i=n+1}^{N}x_{i}}{\sum\limits_{i=1}^{n}x_{i}+k}(k\mu+1_{n}'y_{s}).$ Its Bayes prediction risk is $\frac{\lambda(\sum\limits_{i=1}^{N}x_{i}+k)\sum\limits_{i=n+1}^{N}x_{i}}{(\alpha-2)(\sum\limits_{i=1}^{n}x_{i}+k)}$. Moreover, $\tilde{T}(y_{s})$ is better than $\hat{T}(y_{s})$.

(ⅱ) $\tilde{\beta}_{N}(y_{s})=\frac{1}{\sum\limits_{i=1}^{N}x_{i}}\tilde{T}(y_{s})$, and its Bayes prediction risk is $\frac{\lambda(\sum\limits_{i=1}^{N}x_{i}+k)\sum\limits_{i=n+1}^{N}x_{i}}{(\alpha-2)(\sum\limits_{i=1}^{n}x_{i}+k)(\sum\limits_{i=1}^{N}x_{i})^{2}}$. Moreover, $\tilde{\beta}_{N}(y_{s})$ is better than $\hat{\beta}_{N}(y_{s})$.

In the following, we present a simulation study to illustrate our results; the steps below were carried out on a personal computer with Matlab Version 7.9 (R2009b).

(ⅰ) Randomly generate an $N\times p$ matrix $X$ of full column rank and a $p$-dimensional vector $\mu$;

(ⅱ) Generate $\sigma^{2}$ from the distribution $\Gamma^{-1}(\frac{\alpha}{2}, \frac{\lambda}{2})$ and the random error $\varepsilon$ from $N(0, \sigma^{2}V)$;

(ⅲ) Generate a $p$-dimensional vector $\beta$ from the distribution $N(\mu, \frac{\sigma^{2}}{k}I_{p})$;

(ⅳ) Obtain the dependent variable $y$ from the model $y=X\beta+\varepsilon$;

(ⅴ) Randomly generate an $N$-dimensional vector $Q$, and compute the Bayes prediction and the best linear unbiased prediction of $Qy$ by Theorem 3.1;

(ⅵ) Finally, compare the PMSE of the Bayes prediction with that of the best linear unbiased prediction (an illustrative Python sketch of these steps is given below).
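The paper's simulation was carried out in Matlab; the following is a rough Python/NumPy transcription of steps (ⅰ)-(ⅵ), offered only as an illustrative sketch. The covariance matrix $V$ is not specified for the simulation in the text, so the identity matrix is assumed here; $X$, $\mu$ and $Q$ are held fixed while steps (ⅱ)-(ⅴ) are repeated to approximate the two PMSEs. All function and variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, p, alpha, lam, k = 10, 6, 3, 8, 12, 10
V = np.eye(N)                        # assumption: V is taken as the identity matrix
X = rng.standard_normal((N, p))      # step (i): random design, full column rank a.s.
mu = rng.standard_normal(p)
Q = rng.random(N)                    # step (v): random linear quantity Qy

def one_replication():
    sigma2 = 1.0 / rng.gamma(alpha / 2, 2.0 / lam)                    # sigma^2 ~ inverse-Gamma(alpha/2, lam/2)
    beta = rng.multivariate_normal(mu, sigma2 / k * np.eye(p))        # step (iii)
    y = X @ beta + rng.multivariate_normal(np.zeros(N), sigma2 * V)   # steps (ii), (iv)
    X_s, X_r, y_s = X[:n], X[n:], y[:n]
    XtX = X_s.T @ X_s                                                 # X_s'V_s^{-1}X_s with V = I
    beta_hat = np.linalg.solve(XtX, X_s.T @ y_s)
    Sigma = np.linalg.inv(XtX + k * np.eye(p))
    beta_tilde = Sigma @ (k * mu + XtX @ beta_hat)
    theta_bayes = Q[:n] @ y_s + Q[n:] @ (X_r @ beta_tilde)            # Theorem 3.1 (V_rs = 0 here)
    theta_blup = Q[:n] @ y_s + Q[n:] @ (X_r @ beta_hat)               # best linear unbiased prediction
    return (theta_bayes - Q @ y) ** 2, (theta_blup - Q @ y) ** 2

errs = np.array([one_replication() for _ in range(20000)])
print("approx. PMSE(Bayes), PMSE(BLUP):", errs.mean(axis=0))          # step (vi): Bayes PMSE should be smaller
```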

Now, we take $N=10, n=6, p=3, \alpha=8, \lambda=12, k=10$ and generate the data according to the above steps. The simulation study shows that the Bayes prediction is better than the best linear unbiased prediction, which is consistent with our theoretical conclusions. The data from one experiment are given below.

$ \begin{eqnarray*} &&X=\left(\begin{array}{ccc}0.7079 &-0.6014 &-2.3252\\ 1.9574& 0.5512&-1.2316\\0.5045&-1.0998&1.0556\\1.8645&0.0860&-0.1132\\ -0.3398&-2.0046&0.3792\\ -1.1398& -0.4931& 0.9442\\-0.2111&0.4620&-2.1204\\ 1.1902&-0.3210&-0.6447\\ -1.1162&1.2366&-0.7043\\ 0.6353& -0.6313& -1.0181\end{array}\right), ~~ \beta=\left(\begin{array}{c} -1.3868\\ 0.3785\\ 1.7166 \end{array}\right), \\ && \varepsilon=\left(\begin{array}{c} 0.6693\\ 0.3681\\0.6319\\0.2148\\0.3615\\0.1250 \\0.6272\\ 0.3941\\ 0.4814\\ 0.6497 \end{array}\right), ~~ y=\left(\begin{array}{c} -4.5316\\-4.2521\\1.3281\\-2.5328\\0.7251\\3.1399\\-2.5452\\-2.4847\\1.2884\\-2.2180 \end{array}\right). \end{eqnarray*} $

At this time, we get randomly

$ Q=(0.3139, 0.6382, 0.9866, 0.5029, 0.9477, 0.8280, 0.9176, 0.1131, 0.8121, 0.9083)'. $

By direct computation, we have $Qy=-4.3971$. By Theorem 3.1, we obtain $\tilde{\theta}(y_{s})=-4.8497$, $\hat{\theta}(y_{s})=-5.7928$, and ${\rm PMSE}(\hat{\theta}(y_{s}), Qy)-{\rm PMSE}(\tilde{\theta}(y_{s}), Qy)= 0.0844>0$. Therefore, the Bayes prediction of $Qy$ is better than the best linear unbiased prediction.

4 Bayes Prediction of Quadratic Quantities

In this section, we will discuss Bayes prediction of quadratic quantities $f(H)=y'Hy$, where $H$ is a known symmetric matrix. Assume that $H=\left(\begin{array}{cc}H_{11}&H_{12}\\ H_{21}&H_{22}\end{array}\right)$ with $H_{12}=H_{21}'$, then

$ f(H)=(y_{s}', y_{r}')\left(\begin{array}{cc}H_{11}&H_{12}\\ H_{21}&H_{22}\end{array}\right)\left(\begin{array}{c}y_{s}\\ y_{r}\end{array}\right)=y_{s}'H_{11}y_{s}+y_{s}'H_{12}y_{r}+y_{r}'H_{21}y_{s}+y_{r}'H_{22}y_{r}. $

By Theorem 2.1 and eq. (3.2), we have the following results.

Theorem 4.1 Under model (1.4) and the loss function (3.1), the Bayes prediction of $f(H)$ is

$ \begin{eqnarray*}\hat{f}(H)&=&y_{s}'H_{11}y_{s}+y_{s}'H_{12}E(y_{r}|y_{s})+[E(y_{r}|y_{s})]'H_{21}y_{s}\\&&+[E(y_{r}|y_{s})]'H_{22}[E(y_{r}|y_{s})] +\mathit{\boldsymbol{tr}}[H_{22}{\rm Var}(y_{r}|y_{s})].\end{eqnarray*} $
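A direct numerical transcription of the formula in Theorem 4.1 is sketched below; it assumes that the predictive mean $E(y_{r}|y_{s})$ and covariance ${\rm Var}(y_{r}|y_{s})$ have already been computed, e.g. from eqs. (3.4)-(3.5), and the names `m_r` and `C_r` are ours.

```python
import numpy as np

def bayes_quadratic(y_s, m_r, C_r, H):
    """Bayes prediction of f(H) = y'Hy as in Theorem 4.1 (illustrative sketch)."""
    n = y_s.size
    H11, H12 = H[:n, :n], H[:n, n:]
    H21, H22 = H[n:, :n], H[n:, n:]
    return (y_s @ H11 @ y_s + y_s @ H12 @ m_r + m_r @ H21 @ y_s
            + m_r @ H22 @ m_r + np.trace(H22 @ C_r))
```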

For the population variance $S_{y}^{2}$, we know that

$ S_{y}^{2}=y'\frac{1}{N}(I_{N}-\frac{1}{N}1_{N}1_{N}')y=\frac{1}{N}(y_{s}', y_{r}')\left(\begin{array}{cc}I_{n}-\frac{1}{N}1_{n}1_{n}'&-\frac{1}{N}1_{n}1_{N-n}'\\ -\frac{1}{N}1_{N-n}1_{n}'& I_{N-n}-\frac{1}{N}1_{N-n}1_{N-n}'\end{array}\right)\left(\begin{array}{c}y_{s}\\ y_{r}\end{array}\right), $

where $1_{n}$ denotes the $n$-dimensional vector of ones. Then by Theorem 4.1, we can obtain the following corollary.

Corollary 4.1 The Bayes prediction of the population variance $S_{y}^{2}$ under model (1.4) and the loss function (3.1) is

$ \begin{eqnarray*} \hat{S}_{y}^{2}&=&\frac{1}{N}(y_{s}', E(y_{r}|y_{s})')\left(\begin{array}{cc}I_{n}-\frac{1}{N}1_{n}1_{n}'&-\frac{1}{N}1_{n}1_{N-n}'\\ -\frac{1}{N}1_{N-n}1_{n}'& I_{N-n}-\frac{1}{N}1_{N-n}1_{N-n}'\end{array}\right)\left(\begin{array}{c}y_{s}\\ E(y_{r}|y_{s})\end{array}\right)\\ &&+\frac{1}{N}\mathit{\boldsymbol{tr}}[(I_{N-n}-\frac{1}{N}1_{N-n}1_{N-n}'){\rm Var}(y_{r}|y_{s})]. \end{eqnarray*} $

It is noted that $S_{y}^{2}=\frac{n}{N}S_{y_{s}}^{2}+(1-\frac{n}{N})[S_{y_{r}}^{2}+\frac{n}{N}(\bar{y}_{s}-\bar{y}_{r})^{2}]$, where $\bar{y}_{s}$ and $S_{y_{s}}^{2}$ are the mean and variance of $y_{s}$, $\bar{y}_{r}$ and $S_{y_{r}}^{2}$ are the mean and variance of $y_{r}$. Therefore, the Bayes prediction of the population variance can also be expressed as follows.

Remark 4.1 The Bayes prediction of the population variance $S_{y}^{2}$ under model (1.4) and the loss function (3.1) is

$ \begin{eqnarray*} \hat{S}_{y}^{2} &=&\frac{n}{N}S_{y_{s}}^{2}+(1-\frac{n}{N})\bigg\{\mathit{\boldsymbol{tr}}[\frac{1}{N-n}(I_{N-n}-\frac{1}{N-n}1_{N-n}1_{N-n}'){\rm Var}(y_{r}|y_{s})]\\ &&+E(y_{r}|y_{s})'\frac{1}{N-n}(I_{N-n}-\frac{1}{N-n}1_{N-n}1_{N-n}')E(y_{r}|y_{s})\\ &&+\frac{n}{N}[(\bar{y}_{s}-\frac{1}{N-n}1_{N-n}'E(y_{r}|y_{s}))^{2} +\frac{1}{(N-n)^{2}}1_{N-n}'{\rm Var}(y_{r}|y_{s})1_{N-n}]\bigg\} \end{eqnarray*} $

Proof Since $S_{y}^{2}=\frac{n}{N}S_{y_{s}}^{2}+(1-\frac{n}{N})[S_{y_{r}}^{2}+\frac{n}{N}(\bar{y}_{s}-\bar{y}_{r})^{2}]$, we only derive the Bayes prediction of $S_{y_{r}}^{2}+\frac{n}{N}(\bar{y}_{s}-\bar{y}_{r})^{2}$. Moreover, we know that $S_{y_{r}}^{2}=\frac{1}{N-n}y_{r}'(I_{N-n}-\frac{1}{N-n}1_{N-n}1_{N-n}')y_{r}, $ and $\bar{y}_{r}=\frac{1}{N-n}1_{N-n}'y_{r}$. Therefore, the Bayes prediction of $S_{y_{r}}^{2}$ is

$ \begin{eqnarray} E(S_{y_{r}}^{2}|y_{s})&=&\frac{1}{N-n}\mathit{\boldsymbol{tr}}[(I_{N-n}-\frac{1}{N-n}1_{N-n}1_{N-n}'){\rm Var}(y_{r}|y_{s})]\nonumber\\ &&+E(y_{r}|y_{s})'\frac{1}{N-n}(I_{N-n}-\frac{1}{N-n}1_{N-n}1_{N-n}')E(y_{r}|y_{s}). \end{eqnarray} $ (4.1)

And, the Bayes prediction of $(\bar{y}_{s}-\bar{y}_{r})^{2}$ is

$ \begin{equation} E[(\bar{y}_{s}-\bar{y}_{r})^{2}|y_{s}]=(\bar{y}_{s}-\frac{1}{N-n}1_{N-n}'E(y_{r}|y_{s}))^{2} +\frac{1}{(N-n)^{2}}1_{N-n}'{\rm Var}(y_{r}|y_{s})1_{N-n}. \end{equation} $ (4.2)

According to eqs. (4.1)-(4.2) and the expression of $S_{y}^{2}$, we can derive the result of this remark. It is easy to verify that this result is consistent with Corollary 4.1.
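As a small consistency check (our own sketch with arbitrary inputs), one can verify numerically that Corollary 4.1 and Remark 4.1 give the same value of $\hat{S}_{y}^{2}$; below, `m_r` and `C_r` stand for $E(y_{r}|y_{s})$ and ${\rm Var}(y_{r}|y_{s})$.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n = 10, 6
y_s = rng.standard_normal(n)
m_r = rng.standard_normal(N - n)                 # stands for E(y_r | y_s)
B = rng.standard_normal((N - n, N - n))
C_r = B @ B.T + np.eye(N - n)                    # stands for Var(y_r | y_s)

# Corollary 4.1: S_y^2 as the quadratic form y'(I_N - 1_N 1_N'/N) y / N
H = (np.eye(N) - np.ones((N, N)) / N) / N
v = np.concatenate([y_s, m_r])
s2_cor = v @ H @ v + np.trace(H[n:, n:] @ C_r)

# Remark 4.1: decomposition through S_{y_s}^2, S_{y_r}^2 and (ybar_s - ybar_r)^2
J = np.eye(N - n) - np.ones((N - n, N - n)) / (N - n)
s2_s = ((y_s - y_s.mean()) ** 2).mean()
E_s2_r = (np.trace(J @ C_r) + m_r @ J @ m_r) / (N - n)
E_diff2 = (y_s.mean() - m_r.mean()) ** 2 + np.ones(N - n) @ C_r @ np.ones(N - n) / (N - n) ** 2
s2_rem = n / N * s2_s + (1 - n / N) * (E_s2_r + n / N * E_diff2)

print(np.isclose(s2_cor, s2_rem))                # expected: True
```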

Example 4.1 Let $X=1_{N}, V=(1-\rho)I_{N}+\rho 1_{N}1_{N}'$ in the Bayesian model (1.4), where $\rho\in (0, 1)$ is known. It can be checked that $X_{s}'V_{s}^{-1}X_{s}=\frac{n}{1+(n-1)\rho}$, and

$ \begin{eqnarray*}&&\tilde{\beta}_{s}=(\frac{n}{1+(n-1)\rho}+k)^{-1}[k\mu+\frac{n}{1+(n-1)\rho}\bar{y}_{s}], \\ &&c_{0}=\frac{nk}{k+k\rho(n-1)+n}(\frac{1}{n}1_{n}'y_{s}-\mu)^{2}+\lambda+\frac{1}{1-\rho}y_{s}'(I_{n}-\frac{1}{n}1_{n}1_{n}')y_{s}.\end{eqnarray*} $

Then,

$ E(y_{r}|y_{s})=a 1_{N-n}~~ \mbox{and} ~~{\rm Var}(y_{r}|y_{s})=\frac{c_{0}}{n+\alpha-2}[(1-\rho)I_{N-n}+b 1_{N-n}1_{N-n}'], $

where $a=\frac{1-\rho}{1+(n-1)\rho}\tilde{\beta}_{s}+\frac{n\rho}{1+(n-1)\rho}\bar{y}_{s}$, $b=\frac{1-\rho}{1+(n-1)\rho}[\rho+\frac{1-\rho}{n+k+(n-1)k\rho}]$. According to Remark 4.1, we know that

$ \hat{S}_{y}^{2}=\frac{n}{N}S_{y_{s}}^{2}+\frac{(1-\rho)(N-n-1)c_{0}}{N(n+\alpha-2)}+\frac{(N-n)n}{N^{2}}[(\bar{y}_{s}-a)^{2}+\frac{c_{0}}{n+\alpha-2}(b+\frac{1-\rho}{N-n})]. $
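A small sketch of the closed-form expression in Example 4.1 (our own transcription; $\rho$, the prior hyperparameters and $y_{s}$ are assumed given):

```python
import numpy as np

def s2_bayes_intraclass(y_s, N, rho, mu, k, alpha, lam):
    """Bayes prediction of S_y^2 in Example 4.1 (illustrative sketch)."""
    n = y_s.size
    ybar = y_s.mean()
    w = n / (1 + (n - 1) * rho)                            # X_s'V_s^{-1}X_s
    beta_tilde = (k * mu + w * ybar) / (w + k)
    c0 = (n * k / (k + k * rho * (n - 1) + n)) * (ybar - mu) ** 2 \
         + lam + ((y_s - ybar) ** 2).sum() / (1 - rho)
    a = ((1 - rho) * beta_tilde + n * rho * ybar) / (1 + (n - 1) * rho)
    b = (1 - rho) / (1 + (n - 1) * rho) * (rho + (1 - rho) / (n + k + (n - 1) * k * rho))
    s2_s = ((y_s - ybar) ** 2).mean()                      # sample variance S_{y_s}^2
    return (n / N * s2_s
            + (1 - rho) * (N - n - 1) * c0 / (N * (n + alpha - 2))
            + (N - n) * n / N**2 * ((ybar - a) ** 2
                                    + c0 / (n + alpha - 2) * (b + (1 - rho) / (N - n))))
```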
5 Concluding Remarks

In this paper, we obtain the Bayes prediction of linear and quadratic quantities in a finite population with normal inverse-Gamma prior information. Our study has two limitations. On the one hand, the distribution of the superpopulation model needs to be normal; in many situations, however, only the mean vector and the covariance matrix of the model are known, while the full distribution is not. How should Bayes prediction be handled in this case? On the other hand, if the prior distribution is hierarchical and improper, how can the generalized Bayes prediction be obtained, and what are its optimality properties? Problems such as these deserve further study.

References
[1] Bolfarine H, Rodrigues J. On the simple projection predictor in finite populations[J]. Aust. J. Stat., 1988, 30: 338–341. DOI:10.1111/anzs.1988.30.issue-3
[2] Bolfarine H, Zacks S, Elian S N, Rodrigues J. Optimal prediction of the finite population regression coefficient[J]. Sankhyā. Ser. B., 1994, 56: 1–10.
[3] Xu L W, Lu M, Jiang C F. Optimal prediction in finite populations under matrix loss[J]. J. Stat. Plan. Infer., 2011, 141(8): 2503–2512. DOI:10.1016/j.jspi.2010.11.037
[4] Xu L W, Yu S H. Admissible prediction in superpopulation models with random regression coefficients under matrix loss function[J]. J. Multiv. Anal., 2012, 103(1): 68–76. DOI:10.1016/j.jmva.2011.06.008
[5] Hu G K, Peng P. Linear admissible predictor of finite population regression coefficient under a balanced loss function[J]. J. Math., 2014, 34: 820–828.
[6] Hu G K, Li Q G, Yu S H. Optimal and minimax prediction in multivariate normal populations under a balanced loss function[J]. J. Multiv. Anal., 2014, 128: 154–164. DOI:10.1016/j.jmva.2014.03.014
[7] Hu G K, Li Q G, Yu S H. Linear minimax prediction of finite population regression coefficient under a balanced loss function[J]. Commun. Stat-The. M., 2016, 45(24): 7197–7209. DOI:10.1080/03610926.2014.978945
[8] Bolfarine H, Zacks S. Bayes and minimax prediction in finite populations[J]. J. Stat. Plan. Infer., 1991, 28: 139–151. DOI:10.1016/0378-3758(91)90022-7
[9] Bansal A K, Aggarwal P. Bayes prediction for a heteroscedastic regression superpopulation model using balanced loss function[J]. Commun. Stat-The. M., 2007, 36: 1565–1575. DOI:10.1080/03610920601125797
[10] Bansal A K, Aggarwal P. Bayes prediction of the regression coefficient in a finite population using balanced loss function[J]. Metron., 2009, 67: 1–16.
[11] Bansal A K, Aggarwal P. Bayes prediction for a stratified regression superpopulation model using balanced loss function[J]. Commun. Stat-The. M., 2010, 39: 2789–2799. DOI:10.1080/03610920903128911