In many regression models, especially econometric models, distributional assumptions are often imposed on the error term for the purpose of identification. Typical assumptions include conditional moment restrictions, independence between observations, and conditional symmetry around zero given the independent variables. Several semiparametric estimators have been proposed under conditional symmetry. Manski [1] and Newey [2] estimated regression models under conditional symmetry, while Powell [3] and Newey [4] proposed semiparametric estimators for Tobit models under conditional symmetry.
Despite the wide use of conditional symmetry, tests for it have received relatively little attention in the literature. The first tests were proposed by Powell [5] for censored regression models and by Newey and Powell [6] for linear regression models via asymmetric least squares estimation; however, these tests are unlikely to be consistent against all conditionally asymmetric distributions. Zheng [7] proposed a consistent test of conditional symmetry using a kernel method, but the test statistic contains an integral term and is hard to implement. Bai and Ng [8] proposed an alternative test of conditional symmetry for time series models, which relies on the correct specification of both the conditional mean and the conditional variance. Hyndman and Yao [9] developed a bootstrap test for the symmetry of conditional density functions based on their improved methods for conditional density estimation, but they did not discuss the asymptotic properties of the test statistic, so it is not clear whether the test is consistent. Su [10] gave a simple consistent nonparametric test of conditional symmetry based on conditional characteristic functions, and he [11] also gave an unconditional method by transforming the conditional symmetry testing problem into an unconditional one. Both of his test statistics require a given characteristic function of the probability measure on the value space of the conditioning variable.
In this paper, we propose a simple test for conditional symmetry based on the concept of conditional energy distance. The test is shown to be asymptotically normal under the null hypothesis of conditional symmetry and consistent against any conditionally asymmetric distribution. Our test statistic involves only Euclidean distances and a kernel function, so it is easy to compute.
Székely [12] introduced a concept named energy distance to measure the difference between two independent probability distributions. If $ X $ and $ Y $ are independent random vectors in $ \mathbb{R}^p $ with cumulative distribution functions (cdf) $ F $ and $ G $ respectively, then the energy distance between the distributions $ F $ and $ G $ is defined as
$$ \varepsilon(F, G) = 2E|X-Y| - E|X-X'| - E|Y-Y'|, $$
where $ X' $ is an i.i.d. copy of $ X $, $ Y' $ is an i.i.d. copy of $ Y $, $ E $ denotes expectation, and $ |\,\cdot\,| $ denotes the Euclidean norm. One can also write $ \varepsilon(F, G) $ as $ \varepsilon(X, Y) $ and call it the energy distance between $ X $ and $ Y $. Székely [12] proved that for real-valued random variables this distance is exactly twice Harald Cramér's distance, that is,
$$ \varepsilon(F, G) = 2\int_{-\infty}^{\infty}\left(F(x)-G(x)\right)^2\,dx. $$
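The identity between the moment form and twice Cramér's distance can be checked numerically; the following sketch (the distributions $ N(0,1) $ and $ N(1,1) $, the Monte Carlo sample size, and the integration grid are our illustrative choices, not from the paper) estimates the left side by simulation and the right side by numerical integration:

```python
import numpy as np
from math import erf

rng = np.random.default_rng(2)
n = 200_000

# Two real-valued distributions: F = N(0, 1), G = N(1, 1)
x, x2 = rng.normal(0.0, 1.0, n), rng.normal(0.0, 1.0, n)
y, y2 = rng.normal(1.0, 1.0, n), rng.normal(1.0, 1.0, n)

# Moment form: epsilon(F, G) = 2E|X-Y| - E|X-X'| - E|Y-Y'|
energy = 2 * np.abs(x - y).mean() - np.abs(x - x2).mean() - np.abs(y - y2).mean()

# Twice Cramer's distance: 2 * integral of (F(t) - G(t))^2 dt
t = np.linspace(-10.0, 11.0, 20001)
Phi = np.vectorize(lambda s: 0.5 * (1.0 + erf(s / np.sqrt(2.0))))  # standard normal cdf
cramer2 = 2.0 * np.sum((Phi(t) - Phi(t - 1.0)) ** 2) * (t[1] - t[0])

print(energy, cramer2)  # the two values agree up to Monte Carlo error
```

Both quantities are strictly positive here because $ F \neq G $.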
In higher dimensions, however, the two distances differ because the energy distance is rotation invariant while Cramér's distance is not. The equality becomes
$$ \varepsilon(F, G) = \frac{1}{c_p}\int_{\mathbb{R}^p}\frac{|\phi_X(t)-\phi_Y(t)|^2}{|t|^{p+1}}\,dt, $$
where $ \phi_X(t) $ and $ \phi_Y(t) $ are the characteristic functions of $ X $ and $ Y $, and $ c_p = \frac{\pi^{(p+1)/2}}{\Gamma(\frac{p+1}{2})}. $ Thus $ \varepsilon(F, G)\geq 0 $, with equality if and only if $ F = G $. This property makes it possible to use $ \varepsilon(F, G) $ for testing goodness-of-fit, homogeneity, etc. in a consistent way. We derive our consistent test statistic for conditional symmetry from the idea of energy distance.
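The moment form of the energy distance is straightforward to estimate from two samples by replacing expectations with averages over all pairs. A minimal sketch (the function name and the simulated designs are ours):

```python
import numpy as np

def energy_distance(x, y):
    """Empirical energy distance 2E|X-Y| - E|X-X'| - E|Y-Y'| between
    samples x of shape (n, p) and y of shape (m, p)."""
    xy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2).mean()
    xx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2).mean()
    yy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=2).mean()
    return 2 * xy - xx - yy

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 2))
y = rng.normal(size=(500, 2))           # same distribution: distance near 0
z = rng.normal(loc=2.0, size=(500, 2))  # shifted distribution: distance clearly positive
print(energy_distance(x, y), energy_distance(x, z))
```

The shifted sample yields a distance bounded away from zero, illustrating why the distance supports consistent two-sample testing.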
Let $ X $ be a $ p $-dimensional random vector in Euclidean space $ \mathbb{R}^p $ and $ Z $ be an $ r $-dimensional random vector in Euclidean space $ \mathbb{R}^r $. Denote by $ f(x|z) $ the conditional density function of $ X $ given $ Z $. Consider the hypothesis
$$ H_0: f(x|z) = f(-x|z) \ \text{for all } x\in\mathbb{R}^p \text{ and } z\in S(Z), $$
where $ S(Z) $ denotes the support of the density function of $ Z $. Note that the null hypothesis (3) can be expressed equivalently as the distributional identity $ X|Z = z\stackrel{D}{ = }-X|Z = z $ for every $ z\in S(Z) $.
Analogous to the concept of energy distance for two independent vectors, we can also define the conditional energy distance between $ X $ and $ -X $ given $ Z $ as follows.
Definition 2.1 For $ X $ with finite first moment, the conditional energy distance $ \varepsilon(X, -X|Z) $ between $ X $ and $ -X $ given $ Z $ is defined as the square root of
where $ \phi_{X|Z}(t) $ is the conditional characteristic function of $ X $ given $ Z $. Therefore, $ H_0 $ holds if and only if $ \varepsilon(X, -X|Z = z) = 0 $ for every $ z\in S(Z) $.
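Taking $ Y = -X $ in the moment form of the energy distance and noting $ |(-X)-(-X')| = |X-X'| $ gives $ \varepsilon(X, -X) = 2E(|X+X'| - |X-X'|) $ in the unconditional case. A quick Monte Carlo sketch of this identity (the skewed distribution, sample size, and variable names are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

def draw(k):
    """Independent copies of X = Exp(1) - 1: mean zero but skewed,
    hence NOT symmetric about the origin."""
    return rng.exponential(1.0, size=k) - 1.0

x, x2 = draw(n), draw(n)      # X and an i.i.d. copy X'
y, y2 = -draw(n), -draw(n)    # independent copies of -X

# epsilon(X, -X) via the moment form 2E|X-Y| - E|X-X'| - E|Y-Y'|
lhs = 2 * np.abs(x - y).mean() - np.abs(x - x2).mean() - np.abs(y - y2).mean()
# Simplified form 2E(|X+X'| - |X-X'|)
rhs = 2 * (np.abs(x + x2) - np.abs(x - x2)).mean()

# For a distribution symmetric about 0, the same quantity vanishes
g, g2 = rng.normal(size=n), rng.normal(size=n)
sym = 2 * (np.abs(g + g2) - np.abs(g - g2)).mean()

print(lhs, rhs, sym)  # lhs and rhs agree and are positive; sym is near 0
```

The skewed case gives a strictly positive value, while the symmetric normal case gives a value near zero, matching the characterization above.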
Let $ W_i = (X_i, Z_i), i = 1, 2, \cdots, n $ be a sample from the distribution of $ (X, Z) $ and denote $ \mathbf{W}=(\mathbf{X},\mathbf{Z})=\left\{ {{W}_{1}},{{W}_{2}},\ldots ,{{W}_{n}} \right\} $. The following lemma rewrites $ \varepsilon^2(X, -X|Z = z) $, the value of the conditional energy distance at $ Z = z $, in the form of an expectation.
Lemma 2.1 $ \varepsilon^2(X, -X|Z = z) $ can be rewritten as
$$ \varepsilon^2(X, -X|Z = z) = 2E\left[\,|X_1+X_2| - |X_1-X_2|\;\middle|\;Z_1 = z, Z_2 = z\,\right], $$
where $ (X_1, Z_1) $ and $ (X_2, Z_2) $ are independent copies of $ (X, Z) $.
Therefore, $ X|Z = z\stackrel{D}{ = }-X|Z = z $ for every $ z $ if and only if
$$ E\left[\,|X_1+X_2| - |X_1-X_2|\;\middle|\;Z_1 = z, Z_2 = z\,\right] = 0 \quad \text{for all } z\in S(Z). $$
Proof Given the event $ Z = z $, we consider
According to the equation [12],
we have
Let
where $ f(Z) $ is the density function of $ Z $. Consequently, $ X|Z\stackrel{D}{ = }-X|Z $ if and only if $ \mathcal{S}_a = 0 $. Naturally, we can choose the test statistic for $ H_0 $ as
where $ K_{ik} = K(H^{-1}(Z_i-Z_k)) $.
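For concreteness, $ \mathcal{U}_n $ can be computed directly from the triple-sum form with kernel $ (|X_i+X_j|-|X_i-X_j|)K_{ik}K_{jk} $ and the Gaussian kernel that the paper adopts later. The sketch below is our implementation (function name, data-generating designs, and the averaging over the $ n(n-1)(n-2) $ ordered triples of distinct indices are our choices):

```python
import numpy as np

def conditional_symmetry_stat(x, z, h):
    """U_n: average of (|X_i+X_j| - |X_i-X_j|) K_ik K_jk over the
    n(n-1)(n-2) ordered triples of distinct indices i, j, k, where
    K_ik = K((Z_i - Z_k)/h) with a Gaussian kernel K on R^r."""
    n, r = x.shape[0], z.shape[1]
    # a[i, j] = |X_i + X_j| - |X_i - X_j| (Euclidean norms in R^p)
    a = (np.linalg.norm(x[:, None, :] + x[None, :, :], axis=2)
         - np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2))
    d2 = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=2)
    K = np.exp(-0.5 * d2 / h**2) / (2 * np.pi) ** (r / 2)
    total = 0.0
    for k in range(n):
        w = K[:, k].copy()
        w[k] = 0.0                                      # exclude i = k and j = k
        total += w @ a @ w - np.sum(np.diag(a) * w**2)  # exclude i = j
    return total / (n * (n - 1) * (n - 2))

rng = np.random.default_rng(0)
n = 100
z = rng.normal(size=(n, 1))
scale = np.sqrt(1.0 + z**2)
x_sym = rng.normal(size=(n, 1)) * scale                 # X|Z symmetric about 0
x_skew = (rng.exponential(1.0, (n, 1)) - 1.0) * scale   # X|Z skewed

print(conditional_symmetry_stat(x_sym, z, h=0.5),
      conditional_symmetry_stat(x_skew, z, h=0.5))
```

Note that the statistic is invariant under $ X\mapsto -X $, as it must be, since both $ |X_i+X_j| $ and $ |X_i-X_j| $ are unchanged by the sign flip.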
The test statistic $ \mathcal{U}_n $ has the advantage that it has zero mean under $ H_0 $ and hence carries no finite sample bias term. We now show the consistency of $ \mathcal{U}_n $ and its asymptotic normality under $ H_0 $.
Here, we choose the Gaussian kernel
$$ K(u) = (2\pi)^{-r/2}\exp\left(-\frac{|u|^2}{2}\right) $$
in $ \mathbb{R}^r $, where $ H $ is the diagonal matrix $ {\rm diag}\{h, h, \cdots, h\} $ determined by the bandwidth $ h $. With the Gaussian kernel, $ \sum_i\omega_i(Z)/n $ is known to be consistent under the following regularity conditions.
(C1)
(C2) $ h^r\longrightarrow 0 $ and $ nh^r\longrightarrow \infty $ as $ n\longrightarrow \infty $. This requires $ h $ to be chosen appropriately according to $ n $.
(C3) The density function of $ Z $ and the conditional density function $ f(\cdot|z) $ are twice differentiable and all of the derivatives are bounded.
Using the theory of $ U $-statistics discussed by Fan and Li [13] and Lee [14], we have the following asymptotic normality result.
Theorem 3.1 (Weak convergence) Assume that conditions (C1)–(C3) hold and the second moment of $ X $ exists. If the conditional density of $ X $ given $ Z $ is symmetric and $ h\longrightarrow 0 $, $ nh^r\longrightarrow\infty $ as $ n\longrightarrow\infty $, then $ nh^{r/2}\mathcal{U}_n\xrightarrow[n\rightarrow\infty]{d}N(0, \sigma^2), $ where $ \sigma^2 $ is given in (3.5).
Proof Let $ P_n(W_1, W_2, W_3) = (|X_1+X_2|-|X_1-X_2|)K_{13}K_{23} $. Note that $ P_n(W_1, W_2, W_3) $ is not symmetric with respect to $ W_1, W_2, W_3 $ (it is symmetric only in its first two arguments), so we symmetrize $ P_n $ as
$$ \mathcal{P}_n(W_1, W_2, W_3) = \frac{1}{3}\left[P_n(W_1, W_2, W_3) + P_n(W_1, W_3, W_2) + P_n(W_3, W_2, W_1)\right], $$
then $ \mathcal{U}_n $ can be expressed as a $ U $-statistic of degree 3 with random kernel,
$$ \mathcal{U}_n = \binom{n}{3}^{-1}\sum_{1\leq i<j<k\leq n}\mathcal{P}_n(W_i, W_j, W_k). $$
Denote
and
We use Lemma B.4 in Fan and Li [13] to obtain the asymptotic distribution of $ \mathcal{U}_n $ under $ H_0 $ in the following steps.
Step 1 Under $ H_0 $, $ E\mathcal{P}_n(W_1, W_2, W_3) = 0 $. Note that
Step 2 Under $ H_0 $, $ E[\mathcal{P}_n(W_1, W_2, W_3)|W_1] = 0. $ Because
which also implies that $ E[P_n(W_1, W_3, W_2)|W_1] = 0 $. Moreover, note that
implies $ E[P_n(W_3, W_2, W_1)|W_1] = 0 $. By the definition of $ \mathcal{P}_n(W_1, W_2, W_3) $, we have
Step 3 $ \sigma_{n3}^2/\sigma_{n2}^2 = o(n). $ Obviously, under $ H_0 $,
For $ E\mathcal{P}^2_{n2}(W_1, W_2) $, we have
where
By considering $ P^2_{n2}(W_1, W_3) $ and $ EP_{n2}(W_1, W_2)P_{n2}(W_1, W_3) $ in a similar way, we get
which implies that $ \sigma_{n2}^2 = E\mathcal{P}^2_{n2}(W_1, W_2) = O_p(h^{3r}). $
For $ E\mathcal{P}^2_{n}(W_1, W_2, W_3) $, we have
with
Similarly, we can prove that the remaining three terms in (3.4) are all $ O_p(h^{2r}) $, which implies that $ \sigma_{n3}^2 = E\mathcal{P}^2_n(W_1, W_2, W_3) = O_p(h^{2r}). $ Thus $ \sigma_{n3}^2/\sigma_{n2}^2 = O_P(\frac{1}{h^r}) = o(n) $ holds.
Step 4 We need to prove that, when $ n\longrightarrow \infty $,
As we discussed in Step 3, $ E\mathcal{P}^2_{n2}(W_1, W_2) = O_p(h^{3r}) $. Thus $ (E\mathcal{P}^2_{n2}(W_1, W_2))^2 = O_p(h^{6r}) $. Analogously to $ E\mathcal{P}^2_{n2}(W_1, W_2) $, we can prove that $ E\mathcal{P}^4_{n2}(W_1, W_2) = O_p(h^{5r}) $ by noting that
Moreover,
Therefore
We can verify that $ EG^2_n(W_1, W_2) = O_p(h^{7r}) $ by the additional change of variables $ z_1 = z_2+Hz_{21} $ in the integral. Furthermore, $ E\mathcal{G}^2_n(W_1, W_2) = O_p(h^{7r}) $.
Therefore, under the conditions $ nh^r\longrightarrow \infty $ and $ h^r\longrightarrow 0 $, we obtain that
According to Lemma B.4 in Fan and Li [13], it follows that
Therefore, we finally obtain that $ nh^{r/2}\mathcal{U}_n\xrightarrow[n\rightarrow\infty]{D}N(0, \sigma^2) $ with
The following result provides the consistency of $ \mathcal{U}_n $.
Theorem 4.1 (Consistency) Assume that conditions (C1)–(C3) hold and the second moment of $ X $ exists. Then, as $ n\longrightarrow\infty $, we have $ \mathcal{U}_n\xrightarrow[n\rightarrow\infty]{P}\mathcal{S}_a. $
Proof We complete the proof in two steps.
Step 1 $ \mathcal{U}_n = E[\mathcal{U}_n]+o_p(1). $
We follow the notation in (9) and (10). According to Lee [14], we have
First, we consider $ \sigma_{n1}^2 $ as follows
which means $ \sigma_{n1}^2\leq E\mathcal{P}^2_{n1}(W_1) = O_p(h^{4r}) $.
Analogously to $ \sigma_{n1}^2 $, we can obtain that
Therefore, we get
So $ \mathcal{U}_n = E[\mathcal{U}_n]+o_p(1) $ by Chebyshev's inequality.
Step 2 $ E\mathcal{U}_n = E[\frac{1}2\varepsilon^2(X, -X|Z)f^2(Z)]+O_p(h^2). $
By the definition of $ \mathcal{P}_n(W_1, W_2, W_3) $, it is easy to verify that
Consider $ E[(|X_1+X_2|-|X_1-X_2|)K_{13}K_{23}] $ as follows
Thus, we get
Combining the results in Step 1 and Step 2, we can finally obtain that