AOS6 Topic 1: Linear Combinations of Random Variables

Independent Random Variables:

In probability theory, two random variables A and B are independent if their joint probability function (or joint probability distribution) is the product of their individual probability functions (or probability distributions). For discrete random variables, this means that for all values a and b:

\( Pr(A = a, B = b) = Pr(A = a) \times Pr(B = b) \)

The sum of two independent identically distributed random variables

We start by investigating sums of random variables that are not only independent, but also identically distributed. This means that they will have the same values for their means and standard deviations.

Consider, for example, the numbers observed when two six-sided dice are rolled. Let \( X_1 \) be the number observed when the first die is rolled, and \( X_2 \) be the number observed when the second die is rolled. The two random variables \( X_1 \) and \( X_2 \) are independent and have identical distributions.

What can we say about the distribution of \( X_1 + X_2 \)?

Since the rolling of these two dice can be considered as independent events, we can find probabilities associated with the sum by multiplying probabilities associated with each individual random variable. For example:

Pr(\( X_1 + X_2 = 2 \)) = Pr(\( X_1 = 1, X_2 = 1 \))

= Pr(\( X_1 = 1 \)) × Pr(\( X_2 = 1 \))

= \( \frac{1}{6} \) × \( \frac{1}{6} \) = \( \frac{1}{36} \)

The mean and variance of the sum of two independent identically distributed random variables

Let \( X \) be a random variable with mean \( \mu \) and variance \( \sigma^2 \). Then if \( X_1 \) and \( X_2 \) are independent random variables with identical distributions to \( X \), we have:

\( E(X_1 + X_2) = E(X_1) + E(X_2) = 2\mu \)

\( \text{Var}(X_1 + X_2) = \text{Var}(X_1) + \text{Var}(X_2) = 2\sigma^2 \)

\( \text{sd}(X_1 + X_2) = \sqrt{\text{Var}(X_1 + X_2)} = \sqrt{2\sigma^2} \)

Note: Since \( \text{sd}(X_1 + X_2) = \sqrt{2}\,\sigma \) and \( \text{sd}(X_1) + \text{sd}(X_2) = 2\sigma \), we see that \( \text{sd}(X_1 + X_2) < \text{sd}(X_1) + \text{sd}(X_2) \) for \( \sigma > 0 \).

Example: Calculation of Expectation and Variance

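For the random variables \( X_1 \) and \( X_2 \) of Example 5, we can easily determine that \( E(X_1) = 1 \) and \( E(X_2) = \frac{1}{2} \), and from the probability distribution of \( 2X_1 + 3X_2 \) we find that \( E(2X_1 + 3X_2) = \frac{7}{2} \).

Thus, we have:

\( E(2X_1 + 3X_2) = 2 E(X_1) + 3 E(X_2) = 2 \times 1 + 3 \times \frac{1}{2} = \frac{7}{2} \)

Similarly, we can calculate \( \text{Var}(X_1) = \frac{2}{3} \) and \( \text{Var}(X_2) = \frac{1}{4} \), and from the probability distribution of \( 2X_1 + 3X_2 \) we find that \( \text{Var}(2X_1 + 3X_2) = \frac{59}{12} \).

Thus, we have:

\( \text{Var}(2X_1 + 3X_2) = 2^2 \text{Var}(X_1) + 3^2 \text{Var}(X_2) = 4 \times \frac{2}{3} + 9 \times \frac{1}{4} = \frac{59}{12} \)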
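The two identities can be checked by direct enumeration. The following is a minimal sketch, assuming \( X_1 \) and \( X_2 \) have the distributions tabulated in Example 5; the names `px1`, `px2`, `mean` and `variance` are illustrative only. Exact fractions are used so the results can be compared with the values quoted above.

```python
from fractions import Fraction
from itertools import product

# Assumed distributions (as tabulated in Example 5): value -> probability
px1 = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
px2 = {0: Fraction(1, 2), 1: Fraction(1, 2)}

def mean(p):
    return sum(x * pr for x, pr in p.items())

def variance(p):
    m = mean(p)
    return sum((x - m) ** 2 * pr for x, pr in p.items())

# Left-hand sides: moments of 2*X1 + 3*X2 taken over the joint distribution,
# where independence gives Pr(X1 = a, X2 = b) = Pr(X1 = a) * Pr(X2 = b).
pairs = list(product(px1.items(), px2.items()))
lhs_mean = sum((2 * a + 3 * b) * pa * pb for (a, pa), (b, pb) in pairs)
lhs_var = sum((2 * a + 3 * b - lhs_mean) ** 2 * pa * pb for (a, pa), (b, pb) in pairs)

# Right-hand sides: the linear-combination formulas.
rhs_mean = 2 * mean(px1) + 3 * mean(px2)
rhs_var = 2 ** 2 * variance(px1) + 3 ** 2 * variance(px2)

print(lhs_mean, rhs_mean)  # 7/2 7/2
print(lhs_var, rhs_var)    # 59/12 59/12
```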

The sum of \(n\) independent identically distributed random variables

Let \( X \) be a random variable with mean \( \mu \) and variance \( \sigma^2 \). Then if \( X_1, X_2, \ldots, X_n \) are independent random variables with identical distributions to \( X \), we have:

\( E(X_1 + X_2 + \ldots + X_n) = E(X_1) + E(X_2) + \ldots + E(X_n) = n\mu \)

\( \text{Var}(X_1 + X_2 + \ldots + X_n) = \text{Var}(X_1) + \text{Var}(X_2) + \ldots + \text{Var}(X_n) = n\sigma^2 \)

\( \text{sd}(X_1 + X_2 + \ldots + X_n) = \sqrt{\text{Var}(X_1 + X_2 + \ldots + X_n)} = \sqrt{n\sigma^2} \)

Note: The result for the expected value holds even if the random variables \( X_1, X_2, \ldots, X_n \) are not independent.
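To see why independence matters for the variance but not for the mean, consider an extreme dependent case. The sketch below assumes a fair six-sided die and sets \( X_2 \) equal to the same roll as \( X_1 \), so that \( X_1 + X_2 = 2X_1 \); this choice of example and the helper `moments` are illustrative assumptions.

```python
from fractions import Fraction

# One fair die: value -> probability
die = {k: Fraction(1, 6) for k in range(1, 7)}

def moments(p):
    """Return (mean, variance) of a discrete distribution {value: prob}."""
    m = sum(x * pr for x, pr in p.items())
    v = sum((x - m) ** 2 * pr for x, pr in p.items())
    return m, v

mu, var = moments(die)

# Fully dependent case: X2 is the same roll as X1, so X1 + X2 = 2 * X1
dependent_sum = {2 * k: pr for k, pr in die.items()}
dep_mean, dep_var = moments(dependent_sum)

print(dep_mean == 2 * mu)  # True: the expected values still add
print(dep_var == 2 * var)  # False: the variance is not 2 * sigma^2
print(dep_var == 4 * var)  # True: here it is 4 * sigma^2
```

In this case the mean of the sum is still \( 2\mu \), but the variance is \( 4\sigma^2 \) rather than \( 2\sigma^2 \).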

Linear combination of \( n \) independent random variables:

Let \( X_1, X_2, \ldots, X_n \) be independent random variables with means \( \mu_1, \mu_2, \ldots, \mu_n \) and variances \( \sigma_{1}^2, \sigma_{2}^2, \ldots, \sigma_{n}^2 \) respectively. Then if \( a_1, a_2, \ldots, a_n \) are constants, we have:

\( E(a_1X_1 + a_2X_2 + \ldots + a_nX_n) = a_1 E(X_1) + a_2 E(X_2) + \ldots + a_n E(X_n) \)

\( = a_1\mu_1 + a_2\mu_2 + \ldots + a_n\mu_n \)

\( \text{Var}(a_1X_1 + a_2X_2 + \ldots + a_nX_n) = a_1^2 \text{Var}(X_1) + a_2^2 \text{Var}(X_2) + \ldots + a_n^2 \text{Var}(X_n) \)

\( = a_1^2 \sigma_{1}^2 + a_2^2 \sigma_{2}^2 + \ldots + a_n^2 \sigma_{n}^2 \)

\( \text{sd}(a_1X_1 + a_2X_2 + \ldots + a_nX_n) = \sqrt{a_1^2 \sigma_{1}^2 + a_2^2 \sigma_{2}^2 + \ldots + a_n^2 \sigma_{n}^2} \)

Note: The result for the expected value holds even if the random variables \( X_1, X_2, \ldots, X_n \) are not independent.
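The formulas for a linear combination translate directly into a short helper. This is a sketch only: the function name `linear_combination_moments` and the numbers passed to it are hypothetical, and the variance line is valid only under the independence assumption stated above.

```python
import math

def linear_combination_moments(coeffs, means, variances):
    """Mean, variance and sd of sum(a_i * X_i) for independent X_i."""
    if not (len(coeffs) == len(means) == len(variances)):
        raise ValueError("inputs must have the same length")
    mean = sum(a * m for a, m in zip(coeffs, means))
    # Each coefficient is squared, so a negative a_i still adds variance.
    var = sum(a * a * v for a, v in zip(coeffs, variances))
    return mean, var, math.sqrt(var)

# Hypothetical values: 2*X1 - X2 + 0.5*X3
mean, var, sd = linear_combination_moments(
    coeffs=[2, -1, 0.5],
    means=[1.0, 4.0, 6.0],
    variances=[0.25, 1.0, 4.0],
)
print(mean, var, sd)
```

Note that the coefficient \( -1 \) reduces the mean but, because it is squared, increases the variance.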

Linear combinations of normal random variables

Let \( X_1, X_2, \ldots, X_n \) be independent normal random variables, and let \( a_1, a_2, \ldots, a_n \) be constants. Then the random variable \( a_1X_1 + a_2X_2 + \ldots + a_nX_n \) is also normally distributed.

Example

The time taken to prepare a house for painting is known to be normally distributed with a mean of 10 hours and a standard deviation of 4 hours. The time taken to paint the house is independent of the preparation time and is normally distributed with a mean of 20 hours and a standard deviation of 3 hours.

We are asked to find the probability that the total time taken to prepare and paint the house is more than 35 hours.

Solution

Let \( X \) represent the time taken to prepare the house, and \( Y \) the time taken to paint the house. Since \( X \) and \( Y \) are independent normal random variables, the distribution of \( X + Y \) is also normal, with:

  • \( E(X + Y) = E(X) + E(Y) = 10 + 20 = 30 \)
  • \( \text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) = 4^2 + 3^2 = 25 \)
  • \( \text{sd}(X + Y) = \sqrt{25} = 5 \)

Therefore, we have:

\( Pr(X + Y > 35) = Pr\left(Z > \frac{35 - 30}{5}\right) = Pr(Z > 1) = 0.1587 \)
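The same probability can be obtained numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module; the variable names are illustrative.

```python
import math
from statistics import NormalDist

prep = NormalDist(mu=10, sigma=4)   # preparation time X
paint = NormalDist(mu=20, sigma=3)  # painting time Y

# X + Y is normal: means add, and (by independence) variances add.
total = NormalDist(
    mu=prep.mean + paint.mean,
    sigma=math.sqrt(prep.variance + paint.variance),
)

p = 1 - total.cdf(35)
print(total.mean, total.stdev)  # 30 5.0
print(round(p, 4))              # 0.1587
```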

Example 1

Probability Distribution

Suppose that \( X_1 \) is the number observed when one fair die is rolled, and \( X_2 \) is the number observed when another fair die is rolled. Find the probability distribution of \( Y = X_1 + X_2 \).

Solution:

We can construct the following table to determine the possible values of \( Y = X_1 + X_2 \):

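| \( X_1 \backslash X_2 \) | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| 5 | 6 | 7 | 8 | 9 | 10 | 11 |
| 6 | 7 | 8 | 9 | 10 | 11 | 12 |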

For example, consider \( Y = 4 \). The possible outcomes for this value are:

  • \( X_1 = 1, X_2 = 3 \)
  • \( X_1 = 2, X_2 = 2 \)
  • \( X_1 = 3, X_2 = 1 \)

Therefore,

\( \text{Pr}(Y = 4) \)

\( = \text{Pr}(X_1 = 1, X_2 = 3) + \text{Pr}(X_1 = 2, X_2 = 2) + \text{Pr}(X_1 = 3, X_2 = 1) \)

\( = \text{Pr}(X_1 = 1) \times \text{Pr}(X_2 = 3) + \text{Pr}(X_1 = 2) \times \text{Pr}(X_2 = 2) + \text{Pr}(X_1 = 3) \times \text{Pr}(X_2 = 1) \)

\( = \left(\frac{1}{6} \times \frac{1}{6}\right) + \left(\frac{1}{6} \times \frac{1}{6}\right) + \left(\frac{1}{6} \times \frac{1}{6}\right) \)

\( = \frac{3}{36} \)

Continuing in this way, we can obtain the probability distribution of \( Y = X_1 + X_2 \).

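| \( y \) | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \( \text{Pr}(Y = y) \) | \(\frac{1}{36}\) | \(\frac{2}{36}\) | \(\frac{3}{36}\) | \(\frac{4}{36}\) | \(\frac{5}{36}\) | \(\frac{6}{36}\) | \(\frac{5}{36}\) | \(\frac{4}{36}\) | \(\frac{3}{36}\) | \(\frac{2}{36}\) | \(\frac{1}{36}\) |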
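The table can be reproduced by listing all 36 equally likely outcomes and counting how many give each sum. This is a minimal sketch with illustrative names; note that `Fraction` prints values in lowest terms, so \( \frac{2}{36} \) appears as `1/18`.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Count how many of the 36 (x1, x2) pairs produce each sum.
counts = Counter(a + b for a, b in product(faces, repeat=2))

# Each pair has probability 1/6 * 1/6 = 1/36 because the dice are independent.
dist = {y: Fraction(n, 36) for y, n in sorted(counts.items())}

for y, p in dist.items():
    print(y, p)
```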

Example 2

Finding Mean and Variance

Consider again the random variable \( Y = X_1 + X_2 \) from Example 1. Find:

  a) \( E(Y) \)
  b) \( \text{Var}(Y) \)

Solution

Using the probability distribution of \( Y = X_1 + X_2 \) from Example 1:

a) \( E(Y) = \sum_{y} y \cdot \text{Pr}(Y = y) = \frac{2 + 6 + 12 + \cdots + 12}{36} = \frac{252}{36} = 7 \)

b) \( \text{Var}(Y) = E(Y^2) - [E(Y)]^2 \), where

\( E(Y^2) = \sum_{y} y^2 \cdot \text{Pr}(Y = y) = \frac{4 + 18 + 48 + \cdots + 144}{36} = \frac{1974}{36} \)

∴ \( \text{Var}(Y) = \frac{1974}{36} - 7^2 = \frac{35}{6} \)
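The sums abbreviated with dots above can be evaluated in full. The sketch below takes the numerators of the probability table for \( Y \) as given and uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

ys = range(2, 13)
# Numerators of Pr(Y = y) over 36, for y = 2, 3, ..., 12
counts = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
pr = {y: Fraction(c, 36) for y, c in zip(ys, counts)}

e_y = sum(y * p for y, p in pr.items())       # E(Y)
e_y2 = sum(y * y * p for y, p in pr.items())  # E(Y^2)
var_y = e_y2 - e_y ** 2                       # Var(Y) = E(Y^2) - [E(Y)]^2

print(e_y)    # 7
print(e_y2)   # 329/6, which equals 1974/36
print(var_y)  # 35/6
```

Since a single fair die has variance \( \frac{35}{12} \), this agrees with \( \text{Var}(X_1 + X_2) = 2\sigma^2 \).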

    Example 3

    Consider a random variable \( X \) which has a probability distribution as follows:

    | \( x \) | 0 | 1 | 2 |
    | --- | --- | --- | --- |
    | \( \text{Pr}(X = x) \) | \(\frac{1}{4}\) | \(\frac{1}{2}\) | \(\frac{1}{4}\) |

    Let \( X_1 \), \( X_2 \), and \( X_3 \) be independent random variables with identical distributions to \( X \).

    a) Find the probability distribution of \( X_1 + X_2 + X_3 \).
    b) Hence find the mean, variance, and standard deviation of \( X_1 + X_2 + X_3 \).

    Solution

    a) Using a tree diagram or a similar strategy, we can list all the possible combinations of values of \( X_1 \), \( X_2 \), and \( X_3 \) as follows:

    (0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (0, 1, 1), (0, 1, 2), (0, 2, 0), (0, 2, 1), (0, 2, 2)

    (1, 0, 0), (1, 0, 1), (1, 0, 2), (1, 1, 0), (1, 1, 1), (1, 1, 2), (1, 2, 0), (1, 2, 1), (1, 2, 2)

    (2, 0, 0), (2, 0, 1), (2, 0, 2), (2, 1, 0), (2, 1, 1), (2, 1, 2), (2, 2, 0), (2, 2, 1), (2, 2, 2)

    The value of \( X_1 + X_2 + X_3 \) can be determined for each of the 27 outcomes. Since the three random variables are independent, we can determine the probability of each outcome by multiplying the probabilities of the individual outcomes.

    For example:

    Pr(\( X_1 + X_2 + X_3 = 0 \)) = Pr(\( X_1 = 0, X_2 = 0, X_3 = 0 \))

    = Pr(\( X_1 = 0 \)) × Pr(\( X_2 = 0 \)) × Pr(\( X_3 = 0 \))

    = \( \frac{1}{4} \) × \( \frac{1}{4} \) × \( \frac{1}{4} \) = \( \frac{1}{64} \)

    Continuing in this way, we can obtain the probability distribution of \( X_1 + X_2 + X_3 \).

    | \( y \) | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | \( \text{Pr}(X_1 + X_2 + X_3 = y) \) | \( \frac{1}{64} \) | \( \frac{3}{32} \) | \( \frac{15}{64} \) | \( \frac{5}{16} \) | \( \frac{15}{64} \) | \( \frac{3}{32} \) | \( \frac{1}{64} \) |

    b) Using the probability distribution from part a, we have:

    \( E(X_1 + X_2 + X_3) = 0 \times \frac{1}{64} + 1 \times \frac{3}{32} + 2 \times \frac{15}{64} + \cdots + 6 \times \frac{1}{64} = 3 \)

    \( E[(X_1 + X_2 + X_3)^2] = 0^2 \times \frac{1}{64} + 1^2 \times \frac{3}{32} + 2^2 \times \frac{15}{64} + \cdots + 6^2 \times \frac{1}{64} = \frac{21}{2} \)

    \( \text{Var}(X_1 + X_2 + X_3) = \frac{21}{2} - 3^2 = \frac{3}{2} \)

    Thus

    \( \text{sd}(X_1 + X_2 + X_3) = \sqrt{\frac{3}{2}} \approx 1.225 \)
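The tree-diagram strategy can be mirrored in code by looping over all 27 triples and multiplying the three individual probabilities. This is a sketch with illustrative names (`px`, `dist`), using the distribution of \( X \) given in this example.

```python
import math
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Distribution of X: value -> probability
px = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

dist = defaultdict(Fraction)  # missing keys start at Fraction(0)
for outcome in product(px, repeat=3):  # the 27 triples (x1, x2, x3)
    # Independence: the probability of a triple is the product of its parts.
    prob = px[outcome[0]] * px[outcome[1]] * px[outcome[2]]
    dist[sum(outcome)] += prob

mean = sum(y * p for y, p in dist.items())
var = sum(y * y * p for y, p in dist.items()) - mean ** 2
sd = math.sqrt(var)

for y in sorted(dist):
    print(y, dist[y])
print(mean, var, round(sd, 3))  # 3 3/2 1.225
```

The variance of \( X \) itself is \( \frac{1}{2} \), so the result \( \frac{3}{2} \) matches \( 3\sigma^2 \).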

    Example 4

    Finding Variance and Standard Deviation

    Let \( X \) be a random variable with mean \( \mu = 10 \) and variance \( \sigma^2 = 9 \). If \( X_1, X_2, X_3, X_4 \) are independent random variables with identical distributions to \( X \), find:

    a) \( \text{Var}(X_1 + X_2 + X_3 + X_4) \)

    b) \( \text{sd}(X_1 + X_2 + X_3 + X_4) \)

    Solution:

    We use the formulas for the sum of \( n \) independent identically distributed random variables, with \( n = 4 \):

    \( \text{Var}(X_1 + X_2 + \ldots + X_n) = \text{Var}(X_1) + \text{Var}(X_2) + \ldots + \text{Var}(X_n) = n\sigma^2 \)

    \( \text{sd}(X_1 + X_2 + \ldots + X_n) = \sqrt{\text{Var}(X_1 + X_2 + \ldots + X_n)} = \sqrt{n\sigma^2} \)

    a) \( \text{Var}(X_1 + X_2 + X_3 + X_4) \)

    \( = 4\sigma^2 = 4 \times 9 = 36 \)

    b) \( \text{sd}(X_1 + X_2 + X_3 + X_4) \)

    \( = \sqrt{4\sigma^2} = 2\sigma = 6 \)
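A simulation gives a rough check of these values. The sketch below assumes, for the simulation only, that \( X \) is normally distributed with mean 10 and standard deviation 3; the example itself does not specify a distribution, and the formulas hold for any distribution with this mean and variance. The seed and the number of trials are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA, N_VARS, N_TRIALS = 10, 3, 4, 50_000

# Each trial draws four independent values and adds them.
totals = [
    sum(random.gauss(MU, SIGMA) for _ in range(N_VARS))
    for _ in range(N_TRIALS)
]

sim_mean = statistics.fmean(totals)
sim_var = statistics.pvariance(totals)
sim_sd = sim_var ** 0.5

print(sim_mean)  # close to 4 * 10 = 40
print(sim_var)   # close to 4 * 9 = 36
print(sim_sd)    # close to 6
```

The simulated values will differ slightly from 40, 36 and 6 because of sampling variation.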

    Example 5

    Let \( X_1 \) and \( X_2 \) be independent random variables with the probability distributions given in the following tables. Find the probability distribution of \( Y = 2X_1 + 3X_2 \).

    | \( x_1 \) | 0 | 1 | 2 |
    | --- | --- | --- | --- |
    | \( \text{Pr}(X_1 = x_1) \) | \( \frac{1}{3} \) | \( \frac{1}{3} \) | \( \frac{1}{3} \) |

    | \( x_2 \) | 0 | 1 |
    | --- | --- | --- |
    | \( \text{Pr}(X_2 = x_2) \) | \( \frac{1}{2} \) | \( \frac{1}{2} \) |

    Solution:

    We can construct the following table to determine the possible values of \( Y = 2X_1 + 3X_2 \).

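    | | \( X_2 = 0 \) | \( X_2 = 1 \) |
    | --- | --- | --- |
    | \( X_1 = 0 \) | 0 + 0 = 0 | 0 + 3 = 3 |
    | \( X_1 = 1 \) | 2 + 0 = 2 | 2 + 3 = 5 |
    | \( X_1 = 2 \) | 4 + 0 = 4 | 4 + 3 = 7 |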

    Since \( X_1 \) and \( X_2 \) are independent, we can determine the probability of each outcome by multiplying the probabilities of the individual outcomes. For example:

    \( \text{Pr}(Y = 7) = \text{Pr}(X_1 = 2, X_2 = 1) = \text{Pr}(X_1 = 2) \times \text{Pr}(X_2 = 1) = \frac{1}{3} \times \frac{1}{2} = \frac{1}{6} \)

    Continuing in this way, we can obtain the probability distribution of \( Y = 2X_1 + 3X_2 \).

    | \( y \) | 0 | 2 | 3 | 4 | 5 | 7 |
    | --- | --- | --- | --- | --- | --- | --- |
    | \( \text{Pr}(Y = y) \) | \( \frac{1}{6} \) | \( \frac{1}{6} \) | \( \frac{1}{6} \) | \( \frac{1}{6} \) | \( \frac{1}{6} \) | \( \frac{1}{6} \) |
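The table method generalises to any number of independent discrete random variables. The following sketch wraps it in a function; the name `linear_combination_distribution` and its argument layout are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

def linear_combination_distribution(coeffs, dists):
    """Distribution of sum(a_i * X_i) for independent discrete X_i.

    coeffs: list of constants a_i
    dists:  list of {value: probability} dicts, one per X_i
    """
    result = {}
    # Each combo picks one (value, probability) pair per variable.
    for combo in product(*(d.items() for d in dists)):
        y = sum(a * x for a, (x, _) in zip(coeffs, combo))
        prob = Fraction(1)
        for _, p in combo:
            prob *= p  # independence: multiply the individual probabilities
        # Different combinations may give the same y, so accumulate.
        result[y] = result.get(y, Fraction(0)) + prob
    return dict(sorted(result.items()))

px1 = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
px2 = {0: Fraction(1, 2), 1: Fraction(1, 2)}

dist_y = linear_combination_distribution([2, 3], [px1, px2])
for y, p in dist_y.items():
    print(y, p)  # each of 0, 2, 3, 4, 5, 7 has probability 1/6
```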

    Example 6

    Total Processing Time and Cost

    A manufacturing process involves two stages:

    The time taken to complete the first stage, \( X_1 \) hours, is a continuous random variable with mean \( \mu_1 = 4 \) and standard deviation \( \sigma_1 = 1.5 \).

    The time taken to complete the second stage, \( X_2 \) hours, is a continuous random variable with mean \( \mu_2 = 7 \) and standard deviation \( \sigma_2 = 1 \).

    Assume that the second stage is able to commence immediately after the first stage ends.

    a. Find the mean and standard deviation of the total processing time, if the times taken at each stage are independent.

    b. If the cost of processing is $200 per hour for the first stage and $300 per hour for the second stage, find the mean and standard deviation of the total processing cost.

    Solution

    a. The total processing time is \( X_1 + X_2 \) hours.

    The mean of the total processing time is:

    \( E(X_1 + X_2) = E(X_1) + E(X_2) = 4 + 7 = 11 \) hours

    The variance of the total processing time is:

    \( \text{Var}(X_1 + X_2) = \text{Var}(X_1) + \text{Var}(X_2) = 1.5^2 + 1^2 = 3.25 \)

    Hence the standard deviation of the total processing time is:

    \( \text{sd}(X_1 + X_2) = \sqrt{3.25} \approx 1.803 \) hours

    b. Let \( C \) be the total processing cost in dollars. Then \( C = 200X_1 + 300X_2 \).

    The mean of the total processing cost is:

    \( E(C) = E(200X_1 + 300X_2) \)

    \( = 200 E(X_1) + 300 E(X_2) \)

    \( = 200 \times 4 + 300 \times 7 \)

    \( = 2900 \), so the mean cost is $2900.

    The variance of the total processing cost is:

    \( \text{Var}(C) = \text{Var}(200X_1 + 300X_2) \)

    \( = 200^2 \text{Var}(X_1) + 300^2 \text{Var}(X_2) \)

    \( = 200^2 \times 1.5^2 + 300^2 \times 1^2 \)

    \( = 180\,000 \)

    Hence the standard deviation of the total processing cost is:

    \( \text{sd}(C) = \sqrt{180\,000} \approx 424.26 \), that is, about $424.26.
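Both parts of this example follow the same pattern and can be checked with a few lines of arithmetic. This is a sketch using the figures given in the example; the variable names are illustrative.

```python
import math

means = [4, 7]      # mean time of each stage, in hours
sds = [1.5, 1]      # standard deviation of each stage, in hours
rates = [200, 300]  # cost per hour of each stage, in dollars

# Part a: total time X1 + X2 (all coefficients equal to 1)
time_mean = sum(means)
time_sd = math.sqrt(sum(s ** 2 for s in sds))

# Part b: total cost C = 200*X1 + 300*X2
cost_mean = sum(r * m for r, m in zip(rates, means))
cost_var = sum((r * s) ** 2 for r, s in zip(rates, sds))  # a_i^2 * sigma_i^2
cost_sd = math.sqrt(cost_var)

print(time_mean, round(time_sd, 3))  # 11 1.803
print(cost_mean, cost_var, round(cost_sd, 2))
```

The variance step relies on the stage times being independent, as stated in part a.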

    Example 7

    Probability

    The time taken to prepare a house for painting is known to be normally distributed with a mean of 10 hours and a standard deviation of 4 hours.

    The time taken to paint the house is independent of the preparation time and is normally distributed with a mean of 20 hours and a standard deviation of 3 hours.

    What is the probability that the total time taken to prepare and paint the house is more than 35 hours?

    Solution

    Let \( X \) represent the time taken to prepare the house, and \( Y \) the time taken to paint the house. Since \( X \) and \( Y \) are independent normal random variables, the distribution of \( X + Y \) is also normal, with:

    \( E(X + Y) = E(X) + E(Y) = 10 + 20 = 30 \)

    \( \text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) = 4^2 + 3^2 = 25 \)

    \( \text{sd}(X + Y) = \sqrt{\text{Var}(X + Y)} = \sqrt{25} = 5 \)

    Therefore, the probability that the total time taken to prepare and paint the house is more than 35 hours is:

    \( Pr(X + Y > 35) = Pr\left(Z > \frac{35 - 30}{5}\right) = Pr(Z > 1) \approx 0.1587 \)

    Exercise 1

    What is the probability of obtaining a sum of 2 when rolling two fair six-sided dice?


    Exercise 2

    The discrete random variable \( X \) has probability distribution \( \text{Pr}(X = x) = \frac{x}{36} \) for \( x = 1, 2, 3, \ldots, 8 \).

    Find the expected value \( E(X) \) and the variance \( \text{Var}(X) \).


    Exercise 3

    Let \( X_1 \), \( X_2 \), and \( X_3 \) be independent random variables with identical distributions to \( X \). What is the relationship between the variance of \( X_1 + X_2 + X_3 \) and the variance of \( X \)?


    Exercise 4

    Let \( X \) be a random variable with mean \( \mu = 10 \) and variance \( \sigma^2 = 9 \). If \( X_1, X_2, X_3, X_4 \) are independent random variables with identical distributions to \( X \), find \( E(X_1 + X_2 + X_3 + X_4) \).


    Exercise 5

    Consider again the random variable \( Y = 2X_1 + 3X_2 \) from Example 5. Find:

    a. \( E(Y) \)

    b. \( \text{Var}(Y) \)


    Exercise 6

    The average number of acres burned by forest and range fires in a large New Mexico county is 4,300 acres per year, with a standard deviation of 750 acres. The distribution of the number of acres burned is normal.

    What is the probability that between 2,500 and 4,200 acres will be burned in any given year?
