AOS6 Topic 2: Sample Mean Distribution

The sample mean

Let \( X \) be a normal random variable which represents a particular measure on a population (for example, IQ scores or rope lengths). The mean of \( X \) is \( \mu \) and the standard deviation is \( \sigma \). Samples of size \( n \) selected from this population can be described by independent random variables \( X_1, X_2, \ldots, X_n \) with identical distributions to \( X \).

The sample mean is defined as

\( \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} \)

Since \( \bar{X} \) is a linear combination of independent normal random variables, the random variable \( \bar{X} \) is also normally distributed.

The expected value of \( \bar{X} \) can be found using our general result for linear combinations:

\( E(\bar{X}) = E\left(\frac{1}{n} (X_1 + X_2 + \cdots + X_n)\right) \)

\( = \frac{1}{n} (E(X_1) + E(X_2) + \cdots + E(X_n)) \) where \( a_1 = a_2 = \cdots = a_n = \frac{1}{n} \)

\( = n \times \frac{1}{n} \times \mu \) since \( E(X_i) = E(X) = \mu \)

\( = \mu \)

Similarly, we can find the variance of \( \bar{X} \):

\( \text{Var}(\bar{X}) = \text{Var}\left(\frac{1}{n} (X_1 + X_2 + \cdots + X_n)\right) \)

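\( = \frac{1}{n^2} (\text{Var}(X_1) + \text{Var}(X_2) + \cdots + \text{Var}(X_n)) \) since the \( X_i \) are independent

\( = n \times \left(\frac{1}{n^2}\right) \times \sigma^2 \) since \( \text{Var}(X_i) = \text{Var}(X) = \sigma^2 \)

\( = \frac{\sigma^2}{n} \)

Taking the square root gives the standard deviation of the sample mean, \( \text{sd}(\bar{X}) = \frac{\sigma}{\sqrt{n}} \).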

We can summarise our results as follows.

The Sample Mean of a Normal Random Variable

Let \( X \) be a normally distributed random variable with mean \( \mu \) and standard deviation \( \sigma \).

Let \( X_1, X_2, \ldots, X_n \) represent a sample of size \( n \) selected from this population. The sample mean is defined as

\( \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} \)

The sample mean \( \bar{X} \) is normally distributed with \( E(\bar{X}) = \mu \) and \( \text{sd}(\bar{X}) = \frac{\sigma}{\sqrt{n}} \).

Investigating the distribution of the sample mean using simulation

In the previous section, we made assertions about the distribution of the sample mean \( \bar{X} \), when \( X \) is a normally distributed random variable. In this section, we use simulation to validate these assertions empirically.

Consider the random variable IQ, which we assume is normally distributed with a mean of \( \mu = 100 \) and a standard deviation of \( \sigma = 15 \) in a given population. We will begin by simulating the drawing of a random sample of size 10 from this population.

One random sample of 10 scores, obtained by simulation, is:

105, 109, 104, 86, 118, 100, 81, 94, 70, 88

Recall that the sample mean is denoted by \( \bar{x} \) and that \( \bar{x} = \frac{\sum x}{n} \) where \( \sum \) means ‘sum’ and \( n \) is the size of the sample.

Here the sample mean is:

\[ \bar{x} = \frac{105 + 109 + 104 + 86 + 118 + 100 + 81 + 94 + 70 + 88}{10} = 95.5 \]

A second sample, also obtained by simulation, is:

114, 124, 128, 133, 95, 107, 117, 91, 115, 104

with sample mean:

\[ \bar{x} = \frac{114 + 124 + 128 + 133 + 95 + 107 + 117 + 91 + 115 + 104}{10} = 112.8 \]
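The two sample means above can be checked with a few lines of Python. This is only a sketch: the variable names `sample_1`, `sample_2`, `x_bar_1` and `x_bar_2` are illustrative choices, and the data are the two simulated samples listed above.

```python
from statistics import mean

# The two simulated samples of IQ scores listed in the text
sample_1 = [105, 109, 104, 86, 118, 100, 81, 94, 70, 88]
sample_2 = [114, 124, 128, 133, 95, 107, 117, 91, 115, 104]

# Each sample mean is the sum of the observations divided by the sample size
x_bar_1 = mean(sample_1)
x_bar_2 = mean(sample_2)

print(f"First sample mean:  {x_bar_1}")
print(f"Second sample mean: {x_bar_2}")
```

The two samples come from the same population, yet their means differ by more than 17 points, which is the variation discussed next.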

Since \( \bar{x} \) varies according to the contents of the random samples, we consider the sample means \( \bar{x} \) as being the values of a random variable, which we denote by \( \bar{X} \).

Since \( \bar{x} \) is a statistic which is calculated from a sample, the probability distribution of the random variable \( \bar{X} \) is called a sampling distribution.

Sampling Distribution of the Sample Mean

The sampling distribution of the sample mean refers to the distribution of sample means obtained from multiple random samples of the same size drawn from a population. In statistical terms, if we repeatedly draw samples of the same size from a population and calculate the mean for each sample, the collection of these sample means forms the sampling distribution of the sample mean.
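The idea of repeated sampling can be sketched as a small simulation. The code below assumes the IQ population used earlier (\( \mu = 100 \), \( \sigma = 15 \)) and samples of size 10. The number of repetitions (20 000), the seed, and the helper name `draw_sample_mean` are arbitrary choices made for illustration, not part of the original notes.

```python
import random
from math import sqrt
from statistics import mean, stdev

MU, SIGMA = 100, 15    # population mean and standard deviation (IQ example)
N = 10                 # size of each sample
NUM_SAMPLES = 20_000   # number of repeated samples (arbitrary choice)

rng = random.Random(2024)  # fixed seed (arbitrary) so the run is repeatable


def draw_sample_mean(rng, mu, sigma, n):
    """Draw one random sample of size n from N(mu, sigma^2) and return its mean."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    return sum(sample) / n


# The collection of sample means approximates the sampling distribution
sample_means = [draw_sample_mean(rng, MU, SIGMA, N) for _ in range(NUM_SAMPLES)]

simulated_mean = mean(sample_means)
simulated_sd = stdev(sample_means)
theoretical_sd = SIGMA / sqrt(N)  # sigma / sqrt(n) from the summary result

print(f"Mean of sample means: {simulated_mean:.2f} (theory: {MU})")
print(f"SD of sample means:   {simulated_sd:.2f} (theory: {theoretical_sd:.2f})")
```

With this many repetitions, the simulated mean and standard deviation should lie close to \( \mu \) and \( \frac{\sigma}{\sqrt{n}} \), although the exact figures depend on the seed.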

Example 1

Distribution of Sample Mean

Experience has shown that the heights of a certain population of women can be assumed to be normally distributed with mean \( \mu = 160 \) cm and standard deviation \( \sigma = 8 \) cm. What can be said about the distribution of the sample mean for a sample of size 16?

Solution:

Let \( X \) be the height of a woman chosen at random from this population.

The distribution of the sample mean \( \bar{X} \) is normal with mean \( \mu_{\bar{X}} = \mu = 160 \) and standard deviation \( \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{8}{\sqrt{16}} = 2 \).
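As a quick check of this calculation, the following sketch wraps the rule \( \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \) in a small helper. The function name `sample_mean_distribution` is hypothetical; `NormalDist` is the normal distribution class from Python's standard `statistics` module.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_distribution(mu, sigma, n):
    """Return the distribution of the mean of n independent N(mu, sigma^2) values."""
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    return NormalDist(mu, sigma / sqrt(n))


# Example 1: heights with mu = 160 cm, sigma = 8 cm, sample size 16
height_mean_dist = sample_mean_distribution(mu=160, sigma=8, n=16)

print(f"Mean of sample mean: {height_mean_dist.mean} cm")
print(f"SD of sample mean:   {height_mean_dist.stdev} cm")
```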

Example 2

Probability of Height

Consider the population described in Example 1. What is the probability that:

a) a woman chosen at random has a height greater than \( 168 \) cm

b) a sample of four women chosen at random has an average height greater than \( 168 \) cm?

Solution:

a) \( \text{Pr}(X > 168) = \text{Pr}\left(Z > \frac{168 - 160}{8}\right) = \text{Pr}(Z > 1) = 0.1587 \)

b) The distribution of the sample mean \( \bar{X} \) is normal with mean \( \mu_{\bar{X}} = \mu = 160 \) and standard deviation \( \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{8}{\sqrt{4}} = 4 \).

Thus \( \text{Pr}(\bar{X} > 168) = \text{Pr}\left(Z > \frac{168 - 160}{4}\right) = \text{Pr}(Z > 2) = 0.0228 \)
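Both probabilities can be reproduced numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module in place of standard normal tables; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 160, 8  # population parameters from Example 1

# a) Height of a single woman: X ~ N(160, 8^2)
height = NormalDist(MU, SIGMA)
p_single = 1 - height.cdf(168)

# b) Mean height of a sample of four: standard deviation is sigma / sqrt(4) = 4
mean_of_four = NormalDist(MU, SIGMA / sqrt(4))
p_mean = 1 - mean_of_four.cdf(168)

print(f"Pr(X > 168)     = {p_single:.4f}")
print(f"Pr(X-bar > 168) = {p_mean:.4f}")
```

Rounded to four decimal places, these agree with the values 0.1587 and 0.0228 found above. Averaging four heights halves the standard deviation, which makes an average above 168 cm far less likely than a single height above 168 cm.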

Exercise 1

Practice Makes Perfect

Exercise 2

Practice Makes Perfect