With the expansion of university enrollment, sustained efforts have been made to develop students' abilities in an all-round way. How to further improve teaching quality in courses with large enrollments (such as advanced mathematics) is discussed repeatedly. Since the examination scores of a large number of students follow a normal distribution, statistical theory is a natural tool for studying large-scale teaching (see [1, 2]).
The math score of the students of some grade in a university is a random variable ${\xi _{I}}, $ where ${\xi _{I}}\in I={\left[{0, 100}\right)}.$ Assume that the students are divided into $ n $ classes according to their math scores and taught separately, written as: ${\text{Class}}{\left[a_{1}, a_{2}\right)}$, ${\text{Class}}{\left[a_{2}, a_{3}\right)}$, $\cdots$, ${\text{Class}}{\left[a_{n}, a_{n+1}\right)}, $ where $n\geq 3, 0=a_{1}<a_{2}<\cdots<a_{n+1}=100, $ and $ {a_{i}, a_{i+1}}$ are the lowest and the highest math scores of the students of ${\text{Class}}{\left[{a_{i}, a_{i+1}}\right)}, $ respectively. This model of teaching is called the hierarchical teaching model (see [1-4, 7]). This teaching model is often used in college English and college mathematics teaching. In teaching practice, the previously mentioned score may be the math score of the national college entrance examination or of entrance exams, which represents the students' mathematical basis, or, in mathematical language, the initial value of the teaching.
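The assignment of a student to the class whose interval contains his or her score can be sketched as follows (a minimal sketch; the cut points $a_i$ and the sample scores are hypothetical examples, not data from this paper):

```python
# Sketch: assign each student to the class [a_i, a_{i+1}) containing the score.
# Cut points 0 = a_1 < a_2 < ... < a_{n+1} = 100 are hypothetical examples.
from bisect import bisect_right

def assign_class(score, cuts):
    """Return the 1-based index i with cuts[i-1] <= score < cuts[i]."""
    if not (cuts[0] <= score < cuts[-1]):
        raise ValueError("score outside [a_1, a_{n+1})")
    return bisect_right(cuts, score)

cuts = [0, 60, 80, 100]          # n = 3 classes: [0,60), [60,80), [80,100)
scores = [45, 60, 72, 95]
classes = [assign_class(s, cuts) for s in scores]   # [1, 2, 2, 3]
```

Note that `bisect_right` places a score equal to a cut point into the higher class, matching the half-open intervals $[a_i, a_{i+1})$.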
There is no doubt that this teaching model is better than the traditional teaching model. However, the real reason for its high efficiency, and how to improve it further, have not been found. As far as we know, few papers have dealt with these questions, owing to the difficulty of computing the indefinite integrals involving the normal distribution density function. In [3], by means of numerical simulation, the authors showed that the variance of a hierarchical class is smaller. In [4], the authors established some general properties of the variance in hierarchical teaching, and built a linear model of the teaching efficiency of the hierarchical teaching model. If the students are divided into three classes (superior, middle, poor), the authors believe that all three classes, and especially the third one, benefit from hierarchical teaching.
In order to study the hierarchical teaching model, we need to give the definition of truncated variables.
Definition 1.1 Let ${\xi _I} \in I$ be a continuous random variable, and let its probability density function (p.d.f.) be $f:I \to \left( {0, \infty }\right).$ If ${\xi _{{I^*}}} \in{I^*} \subseteq I $ is also a continuous random variable and its probability density function is
then we call the random variable ${\xi _{{I^*}}}$ a truncated variable of the random variable ${\xi _I}$, denoted by ${\xi _{{I^*}}} \subseteq {\xi _I};$ if ${\xi _{{I^*}}} \subseteq {\xi _I}$ and ${I^*} \subset I$, then we call the random variable ${\xi _{{I^*}}}$ a proper truncated variable of the random variable ${\xi _I}$, denoted by ${\xi _{{I^*}}}\subset {\xi _I}, $ where $I$ and $I^{*}$ are intervals contained in $\left( { - \infty, \infty } \right)$.
In the hierarchical teaching model, the math score of ${\text{Class}} {\left[{a_{i}, a_{i+1}} \right)}$ is also a random variable ${\xi _{\left[{a_{i}, a_{i+1}}\right)}}\in {\left[{a_{i}, a_{i+1}}\right)}.$ Since ${\left[{a_{i}, a_{i+1}} \right)}\subset {I}, $ we say it is a proper truncated variable of the random variable ${\xi _{I}}, $ written as $ {\xi _{\left[{a_{i}, a_{i+1}}\right)}}\subset {\xi _{I}}, i=1, 2, \cdots, n.$ Assume that ${\text{Class}}{\left[{a_{i}, a_{i+1}}\right)}$ and ${\text{Class}}{\left[{a_{i+1}, a_{i+2}}\right)}$ are merged into one, i.e.,
Since ${\left[{a_{i}, a_{i+1}}\right)}\subset {\left[{a_{i}, a_{i+2}}\right)}\;\text{and}\; {\left[{a_{i+1}, a_{i+2}}\right)}\subset {\left[{a_{i}, a_{i+2}}\right)}, $ we know that ${\xi}_{\left[{a_{i}, a_{i+1}}\right)}$ and ${\xi}_{\left[{a_{i+1}, a_{i+2}}\right)}$ are the proper truncated variables of the random variable $ {\xi}_{\left[{a_{i}, a_{i+2}}\right)}.$
We remark here that if ${\xi _I} \in I$ is a continuous random variable, and its p.d.f. is $f:I \to \left( {0, \infty }\right), $ then the integral $\displaystyle\int_I f$ converges, and it satisfies the following two conditions
where $P\left( {{\xi _I} \in \overline I }\right)$ is the probability of the random event ${\xi _I} \in \overline I$, and $\overline I \subseteq I$ is an interval.
According to the definitions of the mathematical expectation $E{\xi _{{I^*}}}$ and the variance $D{\xi _{{I^*}}}$ (see [8, 9]), together with Definition 1.1, we easily get
and
where ${\xi _{{I^*}}}$ is a truncated variable of the random variable ${\xi _I}.$
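Since the density of ${\xi _{{I^*}}}$ is $f$ renormalized on $I^*$, the moments $E{\xi _{{I^*}}}$ and $D{\xi _{{I^*}}}$ can be approximated by simple numerical integration. A minimal sketch (the midpoint Riemann-sum quadrature and the uniform test density are illustrative choices, not part of the paper's method):

```python
# Sketch: mean and variance of the truncated variable xi_{[a,b)} when the
# parent variable xi_I has density f, using midpoint Riemann sums.
def truncated_moments(f, a, b, n=100_000):
    """Return (E, D) of xi_{[a,b)}: f renormalized by its mass on [a,b)."""
    h = (b - a) / n
    xs = [a + (j + 0.5) * h for j in range(n)]
    mass = sum(f(x) for x in xs) * h                       # int_{[a,b)} f
    mean = sum(x * f(x) for x in xs) * h / mass
    var = sum((x - mean) ** 2 * f(x) for x in xs) * h / mass
    return mean, var

# Uniform density on I = [0, 1), truncated to I* = [0, 0.5):
m, v = truncated_moments(lambda x: 1.0, 0.0, 0.5)
# exact values: mean 0.25, variance 0.5**2 / 12
```

For the uniform example the exact truncated moments are known in closed form, which makes it a convenient sanity check for the quadrature.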
In the hierarchical teaching model, what we are concerned with is the relationship between the variance of ${\xi _{\left[{a_{i}, a_{i+1}}\right)}}$ and the variance of ${\xi_{I}}$, where $i=1, 2, \cdots, n.$ The purpose is to determine whether the hierarchical teaching model is superior to the traditional mode of teaching. If
then we believe that the hierarchical teaching model is better than the traditional mode of teaching. Otherwise, we believe that the hierarchical teaching model is not worth promoting.
The normal distribution (see [3, 4, 8, 9]) is considered as the most prominent probability distribution in statistics. Besides the important central limit theorem that says the mean of a large number of random variables drawn from a common distribution, under mild conditions, is distributed approximately normally, the normal distribution is also tractable in the sense that a large number of related results can be derived explicitly and that many qualitative properties may be stated in terms of various inequalities.
One of the main practical uses of the normal distribution is to model empirical distributions of many different random variables encountered in practice. To fit actual data more accurately, much research on generalizing this distribution has been carried out. Some representative examples are the following. In 2001, Armando and other authors extended the p.d.f. to a normal-exponential-gamma form containing four parameters (see [5]). In 2005, Saralees generalized it to the form $K\exp \left\{ { - {{\left| {\frac{{x - \mu }}{\sigma }} \right|}^s}} \right\}$ (see [6]). In 2014, Wen Jiajin generalized the p.d.f. to the $k$-normal distribution as follows (see [7]).
Definition 2.1 If $\xi$ is a continuous random variable and its p.d.f. is
then we say that the random variable $\xi$ follows the $k$-normal distribution, denoted by $\xi\sim {N_k}\left( {\mu , \sigma } \right), $ where $\mu \in \left( { - \infty, \infty } \right), \sigma \in \left( {0, \infty } \right), k \in \left( {1, \infty } \right), $ and $\Gamma \left( s \right) \triangleq \displaystyle\int_0^\infty {{x^{s - 1}}{e^{ - x}}\text{d}x} $ is the gamma function.
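A density of the form $\exp \left( { - \left| {t - \mu } \right|^k / (k{\sigma ^k})} \right)$ with normalizing constant $1/\left( {2{k^{1/k}}\Gamma \left( {1 + 1/k} \right)\sigma } \right)$ reduces to the normal density at $k = 2$ and reproduces the variance factor appearing in the proof of Lemma 2.1; the following sketch assumes this explicit form (it should be checked against [7]):

```python
# Sketch of the k-normal p.d.f. f_k^{mu,sigma}(t).  The normalizing constant
# 1 / (2 * k**(1/k) * Gamma(1 + 1/k) * sigma) is an assumption, reconstructed
# so that k = 2 reproduces the normal density N(mu, sigma).
from math import exp, gamma, pi, sqrt

def knormal_pdf(t, k, mu=0.0, sigma=1.0):
    c = 2.0 * k ** (1.0 / k) * gamma(1.0 + 1.0 / k) * sigma
    return exp(-abs(t - mu) ** k / (k * sigma ** k)) / c

# For k = 2 this coincides with the standard normal density at t = 0:
p = knormal_pdf(0.0, k=2)   # equals 1 / sqrt(2*pi)
```

Under this form, smaller $k$ gives heavier tails and a sharper peak, while larger $k$ flattens the density, which is consistent with the behaviour shown in Figures 1 and 2.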
For the p.d.f. ${f_{k}^{\mu, \sigma}}\left( t \right)$ of $k$-normal distribution, the graphs of the functions ${f_{3/2}^{0, 1}}(t), {f_{2}^{0, 1}}(t)$ and ${f_{5/2}^{0, 1}}(t)$ are depicted in Figure 1 and ${f_{k}^{0, 1}}(t)$ is depicted in Figure 2.
Integrating $f_k^{\mu, \sigma }(t)$ over $(-\infty, \infty)$ by substitution, we obtain that for all $\mu \in \left( { - \infty, \infty } \right), $ $\sigma \in \left( {0, \infty } \right)$ and $k\in \left( {0, \infty } \right)$, we have
If $\xi \sim{N_k}\left( {\mu, \sigma } \right), $ then we have
Formula (2.2) still holds for all $k>0$.
Lemma 2.1 If $\xi \sim {N_k}\left( {\mu, \sigma } \right), $ then we have
Proof It is easy to obtain that
By the graph of the function $\omega{(k)}$ (depicted in Figure 3), we know that the function $\omega(k)=\frac{k^{-2k}\Gamma{(3k)}}{\Gamma {(k)}}$ is monotonically increasing. Hence the function $ \omega_{*}(k)=\omega \left(\frac{1}{k}\right)=\frac{{{k^{2{k^{ - 1}}}}\Gamma \left( {3{k^{ - 1}}} \right)}}{{\Gamma \left( {{k^{ - 1}}} \right)}} $ is monotonically decreasing. Noting that $\omega_{*}(2)=1, $ we get
Using (2.4) and (2.5), we get our desired result (2.3).
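The monotonicity of $\omega_{*}$ and the normalization $\omega_{*}(2)=1$ can be checked numerically. A minimal sketch, assuming (as in the proof above) that $D\xi = \omega_{*}(k)\,\sigma^2$ with $\omega_{*}(k) = k^{2/k}\Gamma(3/k)/\Gamma(1/k)$:

```python
# Sketch: omega_*(k) = k**(2/k) * Gamma(3/k) / Gamma(1/k), the factor in
# D(xi) = omega_*(k) * sigma**2 (an assumption based on Lemma 2.1's proof).
from math import gamma

def omega_star(k):
    return k ** (2.0 / k) * gamma(3.0 / k) / gamma(1.0 / k)

values = [omega_star(k) for k in (1.2, 1.5, 2.0, 3.0, 5.0)]
# omega_*(2) = 1 exactly, and omega_* is strictly decreasing in k,
# so D(xi) <= sigma**2 for k >= 2 and D(xi) > sigma**2 for 1 < k < 2.
```

This matches the claim of Lemma 2.1: the variance falls below $\sigma^2$ exactly when $k \geqslant 2$.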
According to the previous results, the $k$-normal distribution is a new distribution, similar to but distinct from the normal distribution and the generalized normal distribution (see [5, 6]). It is a natural generalization of the normal distribution, and it can be used to fit a number of empirical distributions with different skewness and kurtosis.
We remark here that the $k$-normal distribution has a form similar to, but distinct from, the generalized normal distribution in [6]. By Definition 2.1, we know that ${f_{2}^{\mu, \sigma}}\left( t \right)$ is the p.d.f. of the normal distribution ${N}\left( {\mu, \sigma } \right)$. But the p.d.f. for $s=2$ in [6] is
which does not coincide with the normal distribution. So, to a certain extent, the $k$-normal distribution is a better form of the generalized normal distribution.
In this section, we will study the relationship among the variances of truncated variables. The main result of the paper is as follows.
Theorem 3.1 Let the p.d.f. $f:I \to \left( {0, \infty }\right)$ of the random variable ${\xi _I}$ be differentiable, and let $D{\xi _{{I_*}}}, D{\xi _{{I^*}}}, D{\xi _I}$ be the variances of the truncated variables ${\xi _{{I_*}}}, {\xi _{{I^*}}}, {\xi _I}, $ respectively. If
(ⅰ) $f:I \to \left( {0, \infty }\right)$ is a logarithmic concave function;
(ⅱ) ${\xi _{{I_*}}} \subset {\xi _I}, {\xi_{{I^*}}} \subset {\xi _I}, {I_*} \subset {I^*}$,
then we have the inequalities
Before proving Theorem 3.1, we first establish the following three lemmas.
Lemma 3.1 Let ${\xi _I} \in I$ be a continuous random variable, and let its p.d.f. be $f:I \to \left( {0, \infty } \right).$ If ${\xi _{{I_*}}} \subseteq {\xi _I}, {\xi _{{I^*}}} \subseteq {\xi _I}, {I_*} \subseteq {I^*}, $ then we have
if ${\xi _{{I_*}}}\subseteq {\xi _I}, {\xi _{{I^*}}}\subseteq {\xi _I}, {I_*} \subset {I^*}, $ then we have
Proof By virtue of the hypotheses, we get
thus
It follows therefore from the above facts and Definition 1.1 that we have
Lemma 3.2 Let the function $f:I \to \left( {0, \infty } \right)$ be differentiable. If $f$ is a logarithmic concave function, then we have
Proof We define an auxiliary function $F$ of the variables $u$ and $v$ as
If $v = u, $ then we have $F\left( {u, v} \right) = f\left( u \right) - {\left[{\log f\left( u \right)} \right]'}\displaystyle\int_u^u {f\left( t \right)\text{d}t} = f\left( u \right) > 0.$
By Cauchy mean value theorem, there exists a real number $\theta \in \left( {0, 1}\right)$ for $u\neq v$ such that
If $u < v$, then we have
Combining (3.5) and (3.6), we obtain
So $F\left( {u, v} \right) \geqslant f\left( u \right) > 0.$ This proves inequality (3.4) for $u < v$.
If $u > v$, then we have
Combining (3.5) and (3.7), we obtain
Since $\displaystyle\int_u^v {f\left( t \right)\text{d}t} < 0, $ we have $F\left( {u, v} \right) \geqslant f\left( u \right) > 0.$ So inequality (3.4) also holds in this last case.
Lemma 3.3 Let the function $f:I \to \left( {0, \infty } \right)$ be differentiable. If $f$ is a logarithmic concave function, then the function
satisfies the following inequalities
Proof For the convenience of notation, two real numbers $\alpha$ and $\beta$ with the same sign will be written as $\alpha \sim \beta $.
By the definition, we know that
The power mean inequality asserts (see [10]) that
from which we easily get
We first prove the case $u < v$,
i.e.,
where
It follows from (3.9) and (3.12) that
Since
by (3.9) and (3.15), we have
Hence
Combining (3.9), (3.14), (3.17) and $v > u$ with Lemma 3.2, we can perform the following direct calculation
By (3.17) and $v>u$, we get
By (3.16) and (3.18), we get
By (3.19) and $v>u$, we get
From (3.11) and (3.20), for the case of $v>u$, result (3.8) of Lemma 3.3 follows immediately.
Next, we prove the case of $u > v.$ Based on the above analysis, we obtain the following relations
Thus inequalities (3.8) still hold for $u>v$. This completes our proof.
Now we turn our attention to the proof of Theorem 3.1.
Proof Without loss of generality, we can assume that
Note that
If $\alpha \leqslant a < b < \beta$, then, according to (1.2), (3.10) and Lemma 3.3, we get
hence
If $\alpha < a < b \leqslant \beta $, then, according to (1.2), (3.10) and Lemma 3.3, we get
That is to say, inequality (3.21) still holds.
By Lemma 3.1, we have ${\xi _{{I_*}}} \subset {\xi _I}, {\xi _{{I^*}}} \subset {\xi _I}, {I_*} \subset {I^*}\Rightarrow{\xi _{{I_*}}} \subset {\xi _{{I^*}}}.$ Using inequality (3.21) for ${\xi _{{I_*}}}, {\xi _{{I^*}}}$, we can obtain
Combining inequalities (3.21) and (3.22), we get inequalities (3.1).
This completes the proof of Theorem 3.1.
From Theorem 3.1 we know that if the probability density function of the random variable ${\xi _I}$ is differentiable and log-concave, and ${\xi _{{I_*}}}$ is a proper truncated variable of the random variable ${\xi _{{I^*}}}, $ then the variance of ${\xi _{{I_*}}}$ is less than the variance of ${\xi _{{I^*}}}$. This result is of great significance for the hierarchical teaching model; see the next theorem.
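As a numerical illustration of Theorem 3.1, take the standard normal density (which is log-concave) and nested intervals ${I_*} \subset {I^*} \subset I$; the truncated variances should then be strictly ordered. A minimal sketch using Riemann-sum quadrature (the particular intervals are illustrative):

```python
# Sketch: verify D(xi_{I_*}) < D(xi_{I^*}) < D(xi_I) for the (log-concave)
# standard normal density on nested intervals, by numerical integration.
from math import exp, pi, sqrt

def phi(x):
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

def trunc_var(a, b, n=50_000):
    h = (b - a) / n
    xs = [a + (j + 0.5) * h for j in range(n)]
    mass = sum(phi(x) for x in xs) * h
    mean = sum(x * phi(x) for x in xs) * h / mass
    return sum((x - mean) ** 2 * phi(x) for x in xs) * h / mass

v_inner = trunc_var(-1.0, 1.0)    # I_* = [-1, 1)
v_outer = trunc_var(-3.0, 3.0)    # I^* = [-3, 3)
# v_inner < v_outer < 1 = D(xi) on I = (-inf, inf)
```

The exact truncated-normal values are $1 - 2\varphi(1)/(2\Phi(1)-1) \approx 0.291$ on $[-1,1)$ and $\approx 0.973$ on $[-3,3)$, so the strict ordering of Theorem 3.1 is clearly visible.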
For the convenience of use, Theorem 3.1 can be slightly generalized as follows.
Theorem 3.2 Let $\varphi :I \to \left({-\infty, \infty} \right)$ and $f:I \to \left( {0, \infty } \right)$ be differentiable functions, where $f$ is the p.d.f. of the random variable ${\xi _I}, $ and let $D\varphi \left( {{\xi _{{I_*}}}} \right), D\varphi \left( {{\xi _{{I^*}}}} \right)$ and $D\varphi \left( {{\xi _I}} \right)$ be the variances of $\varphi \left( {{\xi _{{I_*}}}} \right), \varphi \left( {{\xi _{{I^*}}}} \right)$ and $\varphi \left( {{\xi _I}} \right), $ respectively. If
(ⅰ) ${\varphi '}\left( t \right) > 0, \forall t \in I;$
(ⅱ) the function $\left( {f \circ {\varphi ^{ - 1}}} \right){\left( {{\varphi ^{ - 1}}} \right)'}:\varphi \left( I \right) \to \left( {0, \infty } \right)$ is log concave;
(ⅲ) ${\xi _{{I_*}}} \subset {\xi _I}, {\xi _{{I^*}}} \subset \xi _I, {I_*} \subset {I^*}, $
then we have the following inequalities
Proof Set $\overline \xi = \varphi \left( \xi \right), \overline f = \left( {f \circ {\varphi ^{ - 1}}} \right){\left( {{\varphi ^{ - 1}}} \right)'}.$ By condition (ⅰ), we can see that $\xi = {\varphi ^{ - 1}}\left( {\overline \xi } \right), \overline f = \left( {f \circ {\varphi ^{ - 1}}} \right){\left( {{\varphi ^{ - 1}}} \right)'} > 0$ and
Thus $\overline f :\varphi \left( I \right) \to \left( {0, \infty } \right)$ is a p.d.f. of the random variable $\overline \xi$.
By condition (ⅱ), we can see that $\overline f$ is a logarithmic concave function. Combining conditions (ⅰ) and (ⅲ) with Lemma 3.1, we have
We can deduce from Theorem 3.1 that the following is true
Thus inequalities (3.23) are valid.
In the hierarchical teaching model, the math score of the students of some grade in a university is a random variable ${\xi _{I}}, $ where $I={\left[{0, 100}\right)}, \xi _{I}\subset \xi, $ $\xi \in (-\infty, \infty).$ By the central limit theorem (see [8]), $\xi$ follows a normal distribution, that is, $\xi\sim{N_2}\left( {\mu, \sigma }\right).$ If, in the grade, top students and poor students are few, that is, if the variance $ D{\xi} $ of the random variable ${\xi}$ is small, then, according to Figures 1 and 2 together with Lemma 2.1, we believe that there is a real number $k\in \left[ {2, \infty } \right)$ such that $\xi \sim{N_k}\left( {\mu, \sigma } \right).$ Otherwise, there is a real number $k \in \left( {1, 2} \right)$ such that $\xi \sim{N_k}\left( {\mu, \sigma }\right).$ The parameters $k$ and $\sigma$ of ${N_k}(\mu, \sigma )$ can then be determined according to [5].
We have collected three real data sets $X1$, $X2$ and $X3$, which are the math test scores of the students from the unhierarchical, the first-level (superior) and the second-level (poor) classes, containing 263, 149 and 145 records, respectively. To analyze the data further, we first estimate the parameters $k$, $\mu$, $\sigma$ of $N_k(\mu, \sigma)$, then plot the probability density function of $N_k(\mu, \sigma)$ and the frequency histogram of the corresponding data set in the same coordinate system, together with the probability density curve of the normal distribution. In this way we obtain three graphs, one for each of $X1$, $X2$ and $X3$ (see Figures 4-6 in Appendix B). These figures show that the $k$-normal distribution fits better than the normal distribution, since its kurtosis is larger and its variance smaller.
Furthermore, as shown in the histograms, the variances of $X1$, $X2$ and $X3$ are decreasing. By observing the proportions of scores below 60 in $X1$, $X2$ and $X3$, we find that the hierarchical teaching model brings better results, and that the second-level classes (represented by $X3$) receive the most significant benefit from this teaching model.
According to Theorem 3.1 and Lemma 2.1, we have
Theorem 4.1 In the hierarchical teaching model, if $\xi \sim{N_k}\left( {\mu, \sigma }\right), $ where $k>1, $ then for all $i, n: 1\leqslant i \leqslant n-1, n \geqslant 3, $ we have
We carry out a simulation analysis of Theorem 3.1. The simulation procedure is described in Appendix A, and the results are listed in Tables 1-4 there. By comparing the data in these tables, we find that, no matter how the parameters $k$, $\mu$ or $\sigma$ are changed, the variance of a truncated variable is strictly less than that of the untruncated variable. For example, for any $k$, $\mu$ or $\sigma$ as shown in Tables 1-4,
this verifies the truth of Theorem 3.1.
From Tables 1 and 3, we see that for each $\sigma$ and $I\subset (-\infty, \infty)$, if
then $D\xi_{1I} < D\xi_{2I} < D\xi_{3I}$. From Tables 2 and 4, for each $\mu$ and $I\subset (-\infty, \infty)$, if
then $D\eta_{1I} < D\eta_{2I} < D\eta_{3I}$. The truth of Theorem 3.1 is verified.
In the appendix, the data set $X1$ consists of the math test scores of the unhierarchical students, while $X2$ and $X3$ are the math test scores of the hierarchical students. We have computed their variances
The facts $D(X3) < D(X1)$ and $ D(X2) < D(X1) $ show that hierarchical teaching is more efficient than unhierarchical teaching.
The procedure of simulation design is as follows
Step 1 Choose the appropriate parameter $k$, $\mu$ and $\sigma$ in the distribution $N_k(\mu, \sigma)$;
Step 2 Generate 200 random numbers obeying the distribution $\xi\sim N_k(\mu, \sigma)$;
Step 3 Use the 200 numbers to calculate the variance for six truncated $k$-normal variables $\xi _{(-\infty, \infty)}$, $\xi _{[0, 60)}$, $\xi _{[60, 80)}$, $\xi _{[80, 100)}$, $\xi _{[0, 80)}$ and $\xi _{[60, 100)}$;
Step 4 Repeat Steps 2 and 3 fifty times;
Step 5 Calculate the mean of 50 variances for each truncated $k$-normal variable, denoted by $D\xi _{(-\infty, \infty)}$, $ D\xi _{[0, 60)}$, $ D\xi _{[60, 80)}$, $ D\xi _{[80, 100)}$, $ D\xi _{[0, 80)}$ and $ D\xi _{[60, 100)}$ respectively;
Step 6 Change the values of $k$, $\mu$ and $\sigma$, and repeat Steps 1-5. All the results are listed in Tables 1-4 (NaN indicates that no random number fell in the corresponding truncated interval).
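The simulation steps above can be sketched as follows. Sampling from $N_k(\mu, \sigma)$ here uses the representation that $\left| \xi - \mu \right|^k/(k\sigma^k)$ follows a Gamma$(1/k, 1)$ distribution, which is an assumption derived from the density form $\exp(-|t-\mu|^k/(k\sigma^k))$ up to normalization; the parameter values and the single truncation interval shown are illustrative:

```python
# Sketch of Steps 1-5: sample from N_k(mu, sigma) via a gamma variate with a
# random sign (|xi - mu|**k / (k * sigma**k) ~ Gamma(1/k, 1) is an assumption
# based on the assumed density form), then compare truncated variances.
import random
from statistics import pvariance

def knormal_sample(k, mu, sigma, rng):
    g = rng.gammavariate(1.0 / k, 1.0)
    return mu + rng.choice((-1.0, 1.0)) * (k * sigma ** k * g) ** (1.0 / k)

def simulate(k=2.5, mu=70.0, sigma=10.0, reps=50, size=200, seed=0):
    rng = random.Random(seed)
    full, trunc = [], []
    for _ in range(reps):                       # Step 4: 50 repetitions
        xs = [knormal_sample(k, mu, sigma, rng) for _ in range(size)]  # Step 2
        full.append(pvariance(xs))              # Step 3: variance, untruncated
        sub = [x for x in xs if 60 <= x < 80]   # truncated variable xi_[60,80)
        if len(sub) >= 2:
            trunc.append(pvariance(sub))
    return sum(full) / len(full), sum(trunc) / len(trunc)  # Step 5: means

d_full, d_trunc = simulate()
# d_trunc < d_full, consistent with Theorem 3.1
```

The same loop extends directly to the other five truncation intervals of Step 3; the sign-and-gamma sampler reduces to the usual normal sampler at $k = 2$.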
The results of curve fitting for the three real data sets are as follows (see Figures 4-6)