AOS6 Topic 1: Linear Combinations of Random Variables

Independent Random Variables:

In probability theory, two random variables A and B are independent if their joint probability function (or joint probability distribution) is the product of their individual probability functions (or probability distributions). For discrete random variables, this means that for all values \( a \) and \( b \):

\( Pr(A = a, B = b) = Pr(A = a) \times Pr(B = b) \)

The sum of two independent identically distributed random variables

We start by investigating sums of random variables that are not only independent, but also identically distributed. In particular, they have the same mean and the same standard deviation.

Consider, for example, the numbers observed when two six-sided dice are rolled. Let \( X_1 \) be the number observed when the first die is rolled, and \( X_2 \) be the number observed when the second die is rolled. The two random variables \( X_1 \) and \( X_2 \) are independent and have identical distributions.

What can we say about the distribution of \( X_1 + X_2 \)?

Since the rolling of these two dice can be considered as independent events, we can find probabilities associated with the sum by multiplying probabilities associated with each individual random variable. For example:

Pr(\( X_1 + X_2 = 2 \)) = Pr(\( X_1 = 1, X_2 = 1 \))

= Pr(\( X_1 = 1 \)) × Pr(\( X_2 = 1 \))

= \( \frac{1}{6} \) × \( \frac{1}{6} \) = \( \frac{1}{36} \)

The mean and variance of the sum of two independent identically distributed random variables

Let \( X \) be a random variable with mean \( \mu \) and variance \( \sigma^2 \). Then if \( X_1 \) and \( X_2 \) are independent random variables with identical distributions to \( X \), we have:

\( E(X_1 + X_2) = E(X_1) + E(X_2) = 2\mu \)

\( \text{Var}(X_1 + X_2) = \text{Var}(X_1) + \text{Var}(X_2) = 2\sigma^2 \)

\( \text{sd}(X_1 + X_2) = \sqrt{\text{Var}(X_1 + X_2)} = \sqrt{2\sigma^2} \)

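Note: Since \( \text{sd}(X_1) + \text{sd}(X_2) = 2\sigma \) and \( \sqrt{2\sigma^2} = \sqrt{2}\,\sigma \), we see that \( \text{sd}(X_1 + X_2) < \text{sd}(X_1) + \text{sd}(X_2) \) for \( \sigma > 0 \). Variances of independent random variables add; standard deviations do not.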

Example: Calculation of Expectation and Variance

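Let \( X_1 \) and \( X_2 \) be the independent random variables of Example 5, where the distribution of \( 2X_1 + 3X_2 \) is found. From those distributions we can determine that \( E(X_1) = 1 \) and \( E(X_2) = \frac{1}{2} \), and that \( E(2X_1 + 3X_2) = \frac{7}{2} \).

Thus, we have:

\( E(2X_1 + 3X_2) = 2 E(X_1) + 3 E(X_2) = 2 \times 1 + 3 \times \frac{1}{2} = \frac{7}{2} \)

Similarly, we can calculate \( \text{Var}(X_1) = \frac{2}{3} \) and \( \text{Var}(X_2) = \frac{1}{4} \), and \( \text{Var}(2X_1 + 3X_2) = \frac{59}{12} \).

Thus, we have:

\( \text{Var}(2X_1 + 3X_2) = 2^2 \text{Var}(X_1) + 3^2 \text{Var}(X_2) = \frac{8}{3} + \frac{9}{4} = \frac{59}{12} \)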
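
As a check, both results can be confirmed by direct enumeration. The sketch below assumes that \( X_1 \) and \( X_2 \) have the distributions given in Example 5 (which match the means and variances quoted here); the helper names `mean` and `variance` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumed distributions (taken from Example 5): value -> probability
pmf_x1 = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
pmf_x2 = {0: Fraction(1, 2), 1: Fraction(1, 2)}

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum(x * x * p for x, p in pmf.items()) - m * m

# Distribution of Y = 2*X1 + 3*X2; independence lets us multiply probabilities
pmf_y = {}
for (x1, p1), (x2, p2) in product(pmf_x1.items(), pmf_x2.items()):
    y = 2 * x1 + 3 * x2
    pmf_y[y] = pmf_y.get(y, 0) + p1 * p2

# Direct calculation on the left, linear-combination formula on the right
print(mean(pmf_y), 2 * mean(pmf_x1) + 3 * mean(pmf_x2))
print(variance(pmf_y), 4 * variance(pmf_x1) + 9 * variance(pmf_x2))
```

Exact fractions are used so that the two sides can be compared without rounding error.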

The sum of \(n\) independent identically distributed random variables

Let \( X \) be a random variable with mean \( \mu \) and variance \( \sigma^2 \). Then if \( X_1, X_2, \ldots, X_n \) are independent random variables with identical distributions to \( X \), we have:

\( E(X_1 + X_2 + \ldots + X_n) = E(X_1) + E(X_2) + \ldots + E(X_n) = n\mu \)

\( \text{Var}(X_1 + X_2 + \ldots + X_n) = \text{Var}(X_1) + \text{Var}(X_2) + \ldots + \text{Var}(X_n) = n\sigma^2 \)

\( \text{sd}(X_1 + X_2 + \ldots + X_n) = \sqrt{\text{Var}(X_1 + X_2 + \ldots + X_n)} = \sqrt{n\sigma^2} \)

Note: The result for the expected value holds even if the random variables \( X_1, X_2, \ldots, X_n \) are not independent.
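
The formulas for the mean and variance can be checked exactly in a small case. The following sketch is an illustration rather than part of the original notes: it takes \( X \) to be a fair six-sided die (an arbitrary choice), builds the distribution of the sum of \( n = 3 \) independent copies by repeated convolution, and compares the results with \( n\mu \) and \( n\sigma^2 \). The function names `convolve` and `sum_of_iid` are made up for this example.

```python
from fractions import Fraction

def convolve(pmf_a, pmf_b):
    """Distribution of A + B for independent A and B (value -> probability dicts)."""
    out = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] = out.get(a + b, 0) + pa * pb
    return out

def sum_of_iid(pmf, n):
    """Distribution of X_1 + ... + X_n for n independent copies of X."""
    total = {0: Fraction(1)}  # distribution of an empty sum
    for _ in range(n):
        total = convolve(total, pmf)
    return total

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    return sum(x * x * p for x, p in pmf.items()) - mean(pmf) ** 2

die = {face: Fraction(1, 6) for face in range(1, 7)}  # illustrative choice of X
n = 3
total = sum_of_iid(die, n)

print(mean(total), n * mean(die))          # direct value vs n * mu
print(variance(total), n * variance(die))  # direct value vs n * sigma^2
```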

Linear combination of \( n \) independent random variables:

Let \( X_1, X_2, \ldots, X_n \) be independent random variables with means \( \mu_1, \mu_2, \ldots, \mu_n \) and variances \( \sigma_{1}^2, \sigma_{2}^2, \ldots, \sigma_{n}^2 \) respectively. Then if \( a_1, a_2, \ldots, a_n \) are constants, we have:

\( E(a_1X_1 + a_2X_2 + \ldots + a_nX_n) = a_1 E(X_1) + a_2 E(X_2) + \ldots + a_n E(X_n) \)

\( = a_1\mu_1 + a_2\mu_2 + \ldots + a_n\mu_n \)

\( \text{Var}(a_1X_1 + a_2X_2 + \ldots + a_nX_n) = a_1^2 \text{Var}(X_1) + a_2^2 \text{Var}(X_2) + \ldots + a_n^2 \text{Var}(X_n) \)

\( = a_1^2 \sigma_{1}^2 + a_2^2 \sigma_{2}^2 + \ldots + a_n^2 \sigma_{n}^2 \)

\( \text{sd}(a_1X_1 + a_2X_2 + \ldots + a_nX_n) = \sqrt{a_1^2 \sigma_{1}^2 + a_2^2 \sigma_{2}^2 + \ldots + a_n^2 \sigma_{n}^2} \)

Note: The result for the expected value holds even if the random variables \( X_1, X_2, \ldots, X_n \) are not independent.
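
These three formulas translate directly into a small helper. The sketch below is one possible way to write it; the function name `linear_combination_stats` and the numbers in the illustration are hypothetical, and the variance result is only valid when the random variables are independent.

```python
import math

def linear_combination_stats(coeffs, means, variances):
    """Mean, variance and sd of a_1*X_1 + ... + a_n*X_n for independent X_i."""
    if not (len(coeffs) == len(means) == len(variances)):
        raise ValueError("coeffs, means and variances must have the same length")
    mean = sum(a * m for a, m in zip(coeffs, means))
    var = sum(a * a * v for a, v in zip(coeffs, variances))  # coefficients are squared
    return mean, var, math.sqrt(var)

# Hypothetical illustration: 2*X_1 - X_2 with means 3, 5 and variances 4, 9
mean, var, sd = linear_combination_stats([2, -1], [3, 5], [4, 9])
print(mean, var, sd)  # 1 25 5.0
```

Because each coefficient is squared, a negative coefficient still increases the variance: subtracting an independent random variable adds to the spread rather than reducing it.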

Linear combinations of normal random variables

Let \( X_1, X_2, \ldots, X_n \) be independent normal random variables, and let \( a_1, a_2, \ldots, a_n \) be constants. Then the random variable \( a_1X_1 + a_2X_2 + \ldots + a_nX_n \) is also normally distributed.

Example

The time taken to prepare a house for painting is known to be normally distributed with a mean of 10 hours and a standard deviation of 4 hours. The time taken to paint the house is independent of the preparation time and is normally distributed with a mean of 20 hours and a standard deviation of 3 hours.

We are asked to find the probability that the total time taken to prepare and paint the house is more than 35 hours.

Solution

Let \( X \) represent the time taken to prepare the house, and \( Y \) the time taken to paint the house. Since \( X \) and \( Y \) are independent normal random variables, the distribution of \( X + Y \) is also normal, with:

  • \( E(X + Y) = E(X) + E(Y) = 10 + 20 = 30 \)
  • \( \text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) = 4^2 + 3^2 = 25 \)
  • \( \text{sd}(X + Y) = \sqrt{25} = 5 \)

Therefore, we have:

\( Pr(X + Y > 35) = Pr\left(Z > \frac{35 - 30}{5}\right) = Pr(Z > 1) = 0.1587 \)
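
The same calculation can be reproduced with the `NormalDist` class from Python's standard `statistics` module (Python 3.8 or later). This is a sketch for checking the arithmetic, not a required method; the variable names are illustrative.

```python
from statistics import NormalDist

prep = NormalDist(mu=10, sigma=4)   # preparation time X, in hours
paint = NormalDist(mu=20, sigma=3)  # painting time Y, in hours

# Adding two NormalDist objects treats them as independent:
# the means add and the variances add.
total = prep + paint

prob = 1 - total.cdf(35)  # Pr(X + Y > 35)
print(total.mean, total.stdev, round(prob, 4))
```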

Example 1

Probability Distribution
Suppose that \( X_1 \) is the number observed when one fair die is rolled, and \( X_2 \) is the number observed when another fair die is rolled. Find the probability distribution of \( Y = X_1 + X_2 \).

Solution:
We can construct the following table to determine the possible values of \( Y = X_1 + X_2 \):
X1 \ X2 |  1   2   3   4   5   6
   1    |  2   3   4   5   6   7
   2    |  3   4   5   6   7   8
   3    |  4   5   6   7   8   9
   4    |  5   6   7   8   9  10
   5    |  6   7   8   9  10  11
   6    |  7   8   9  10  11  12
For example, consider \( Y = 4 \). The possible outcomes for this value are:
  • \( X_1 = 1, X_2 = 3 \)
  • \( X_1 = 2, X_2 = 2 \)
  • \( X_1 = 3, X_2 = 1 \)

Therefore,
\( \text{Pr}(Y = 4) \)
\( = \text{Pr}(X_1 = 1, X_2 = 3) + \text{Pr}(X_1 = 2, X_2 = 2) + \text{Pr}(X_1 = 3, X_2 = 1) \)
\( = \text{Pr}(X_1 = 1) \times \text{Pr}(X_2 = 3) + \text{Pr}(X_1 = 2) \times \text{Pr}(X_2 = 2) + \text{Pr}(X_1 = 3) \times \text{Pr}(X_2 = 1) \)
\( = \left(\frac{1}{6} \times \frac{1}{6}\right) + \left(\frac{1}{6} \times \frac{1}{6}\right) + \left(\frac{1}{6} \times \frac{1}{6}\right) \)
\( = \frac{3}{36} \)
Continuing in this way, we can obtain the probability distribution of \( Y = X_1 + X_2 \).
y 2 3 4 5 6 7 8 9 10 11 12
Pr(Y = y) \(\frac{1}{36}\) \(\frac{2}{36}\) \(\frac{3}{36}\) \(\frac{4}{36}\) \(\frac{5}{36}\) \(\frac{6}{36}\) \(\frac{5}{36}\) \(\frac{4}{36}\) \(\frac{3}{36}\) \(\frac{2}{36}\) \(\frac{1}{36}\)
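
The table of probabilities can also be generated by counting outcomes. A minimal sketch, using only the standard library (the variable names are illustrative):

```python
from collections import Counter
from fractions import Fraction

faces = range(1, 7)

# Count how many of the 36 equally likely (x1, x2) pairs give each sum
counts = Counter(x1 + x2 for x1 in faces for x2 in faces)

# Fraction reduces automatically, so 3/36 is displayed as 1/12
dist = {y: Fraction(c, 36) for y, c in sorted(counts.items())}

for y, p in dist.items():
    print(y, p)
```

The printed fractions are in lowest terms, so they look different from the table above but are equal in value.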

Example 2

Finding Mean and Variance
Consider again the random variable \( Y = X_1 + X_2 \) from the previous example. Find:
  a) \( E(Y) \)
  b) \( \text{Var}(Y) \)

Solution
Using the probability distribution of \( Y = X_1 + X_2 \) from Example 1:
  a) \( E(Y) = \sum_{y} y \cdot \text{Pr}(Y = y) = \frac{2 + 6 + 12 + \cdots + 12}{36} = \frac{252}{36} = 7 \)

  b) \( \text{Var}(Y) = E(Y^2) - [E(Y)]^2 \), where
     \( E(Y^2) = \sum_{y} y^2 \cdot \text{Pr}(Y = y) = \frac{4 + 18 + 48 + \cdots + 144}{36} = \frac{1974}{36} \)

     ∴ \( \text{Var}(Y) = \frac{1974}{36} - 7^2 = \frac{1974}{36} - 49 = \frac{35}{6} \)
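
The sums in parts a and b are easy to check by machine. The sketch below rebuilds the distribution of \( Y \) for the sum of two fair dice using the pattern \( \text{Pr}(Y = y) = \frac{6 - |y - 7|}{36} \), which reproduces the table found in the previous example, and then evaluates \( E(Y) \), \( E(Y^2) \) and \( \text{Var}(Y) \) exactly.

```python
from fractions import Fraction

# Pr(Y = y) = (6 - |y - 7|) / 36 for y = 2, ..., 12 (sum of two fair dice)
pmf_y = {y: Fraction(6 - abs(y - 7), 36) for y in range(2, 13)}

e_y = sum(y * p for y, p in pmf_y.items())
e_y2 = sum(y * y * p for y, p in pmf_y.items())
var_y = e_y2 - e_y ** 2

print(e_y, e_y2, var_y)  # 7 329/6 35/6
```

Here 329/6 is 1974/36 in lowest terms. The variance also agrees with \( 2\sigma^2 \), since a single fair die has variance \( \frac{35}{12} \).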

    Example 3

    Consider a random variable \( X \) which has a probability distribution as follows:
    x 0 1 2
    Pr(X = x) \(\frac{1}{4}\) \(\frac{1}{2}\) \(\frac{1}{4}\)
    Let \( X_1 \), \( X_2 \), and \( X_3 \) be independent random variables with identical distributions to \( X \).
    a) Find the probability distribution of \( X_1 + X_2 + X_3 \).
    b) Hence find the mean, variance, and standard deviation of \( X_1 + X_2 + X_3 \).

    Solution
    a) Using a tree diagram or a similar strategy, we can list all the possible combinations of values of \( X_1 \), \( X_2 \), and \( X_3 \) as follows:
    (0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (0, 1, 1), (0, 1, 2), (0, 2, 0), (0, 2, 1), (0, 2, 2)
    (1, 0, 0), (1, 0, 1), (1, 0, 2), (1, 1, 0), (1, 1, 1), (1, 1, 2), (1, 2, 0), (1, 2, 1), (1, 2, 2)
    (2, 0, 0), (2, 0, 1), (2, 0, 2), (2, 1, 0), (2, 1, 1), (2, 1, 2), (2, 2, 0), (2, 2, 1), (2, 2, 2)
    The value of \( X_1 + X_2 + X_3 \) can be determined for each of the 27 outcomes. Since the three random variables are independent, we can determine the probability of each outcome by multiplying the probabilities of the individual outcomes.
    For example:
    Pr(\( X_1 + X_2 + X_3 = 0 \)) = Pr(\( X_1 = 0, X_2 = 0, X_3 = 0 \))
    = Pr(\( X_1 = 0 \)) × Pr(\( X_2 = 0 \)) × Pr(\( X_3 = 0 \))
    = \( \frac{1}{4} \) × \( \frac{1}{4} \) × \( \frac{1}{4} \) = \( \frac{1}{64} \)
    Continuing in this way, we can obtain the probability distribution of \( X_1 + X_2 + X_3 \).
    y 0 1 2 3 4 5 6
    Pr(\( X_1 + X_2 + X_3 = y \)) \( \frac{1}{64} \) \( \frac{3}{32} \) \( \frac{15}{64} \) \( \frac{5}{16} \) \( \frac{15}{64} \) \( \frac{3}{32} \) \( \frac{1}{64} \)

    b) Using the probability distribution from part a, we have
    \( E(X_1 + X_2 + X_3) = 0 \times \frac{1}{64} + 1 \times \frac{3}{32} + 2 \times \frac{15}{64} + \cdots + 6 \times \frac{1}{64} = 3 \)
    \( E[(X_1 + X_2 + X_3)^2] = 0^2 \times \frac{1}{64} + 1^2 \times \frac{3}{32} + 2^2 \times \frac{15}{64} + \cdots + 6^2 \times \frac{1}{64} = \frac{21}{2} \)
    \( \text{Var}(X_1 + X_2 + X_3) = \frac{21}{2} - 3^2 = \frac{3}{2} \)
    Thus
    \( \text{sd}(X_1 + X_2 + X_3) = \sqrt{\frac{3}{2}} \approx 1.225 \)
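
The listing of all 27 outcomes can be automated. The following sketch mirrors the method of part a with `itertools.product` and then repeats the calculations of part b; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

pmf_x = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Enumerate all 27 outcomes (x1, x2, x3); independence lets us multiply probabilities
pmf_sum = {}
for outcome in product(pmf_x, repeat=3):
    prob = pmf_x[outcome[0]] * pmf_x[outcome[1]] * pmf_x[outcome[2]]
    pmf_sum[sum(outcome)] = pmf_sum.get(sum(outcome), 0) + prob

mean = sum(y * p for y, p in pmf_sum.items())
var = sum(y * y * p for y, p in pmf_sum.items()) - mean ** 2
sd = float(var) ** 0.5

print(dict(sorted(pmf_sum.items())))
print(mean, var, round(sd, 3))
```

The results also agree with the shortcut formulas: \( X \) has mean 1 and variance \( \frac{1}{2} \), so the sum of three independent copies has mean \( 3 \times 1 = 3 \) and variance \( 3 \times \frac{1}{2} = \frac{3}{2} \).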

    Example 4

    Finding Variance and Standard Deviation

    Let \( X \) be a random variable with mean \( \mu = 10 \) and variance \( \sigma^2 = 9 \). If \( X_1, X_2, X_3, X_4 \) are independent random variables with identical distributions to \( X \), find:
    a) \( \text{Var}(X_1 + X_2 + X_3 + X_4) \)
    b) \( \text{sd}(X_1 + X_2 + X_3 + X_4) \)

    Solution:
    We use the formulas for the sum of \( n \) independent identically distributed random variables, with \( n = 4 \):
    \( \text{Var}(X_1 + X_2 + \ldots + X_n) = \text{Var}(X_1) + \text{Var}(X_2) + \ldots + \text{Var}(X_n) = n\sigma^2 \)
    \( \text{sd}(X_1 + X_2 + \ldots + X_n) = \sqrt{\text{Var}(X_1 + X_2 + \ldots + X_n)} = \sqrt{n\sigma^2} \)

    a) \( \text{Var}(X_1 + X_2 + X_3 + X_4) \)
    \( = 4\sigma^2 = 4 \times 9 = 36 \)
    b) \( \text{sd}(X_1 + X_2 + X_3 + X_4) \)
    \( = \sqrt{4\sigma^2} = 2\sigma = 6 \)

    Example 5

    Let \( X_1 \) and \( X_2 \) be independent random variables with the probability distributions given in the following tables. Find the probability distribution of \( Y = 2X_1 + 3X_2 \).
    \( x_1 \) 0 1 2
    Pr(\( X_1 = x_1 \)) \( \frac{1}{3} \) \( \frac{1}{3} \) \( \frac{1}{3} \)

    \( x_2 \) 0 1
    Pr(\( X_2 = x_2 \)) \( \frac{1}{2} \) \( \frac{1}{2} \)

    Solution:
    We can construct the following table to determine the possible values of \( Y = 2X_1 + 3X_2 \).
    X1 \ X2 |     0     |     1
       0    | 0 + 0 = 0 | 0 + 3 = 3
       1    | 2 + 0 = 2 | 2 + 3 = 5
       2    | 4 + 0 = 4 | 4 + 3 = 7

    Since \( X_1 \) and \( X_2 \) are independent, we can determine the probability of each outcome by multiplying the probabilities of the individual outcomes. For example:
    Pr\( (Y = 7) \) = Pr\( (X_1 = 2, X_2 = 1) \) = Pr\( (X_1 = 2) \) × Pr\( (X_2 = 1) \) = \( \frac{1}{3} \) × \( \frac{1}{2} \) = \( \frac{1}{6} \)
    Continuing in this way, we can obtain the probability distribution of \( Y = 2X_1 + 3X_2 \).
    y 0 2 3 4 5 7
    Pr\( (Y = y) \) \( \frac{1}{6} \) \( \frac{1}{6} \) \( \frac{1}{6} \) \( \frac{1}{6} \) \( \frac{1}{6} \) \( \frac{1}{6} \)
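
The table of values and the resulting distribution can be produced in a few lines. This is a sketch under the distributions stated in the question; the variable names are illustrative.

```python
from fractions import Fraction

pmf_x1 = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
pmf_x2 = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# One row per joint outcome: (x1, x2, y, probability)
rows = [(x1, x2, 2 * x1 + 3 * x2, p1 * p2)
        for x1, p1 in pmf_x1.items()
        for x2, p2 in pmf_x2.items()]

# Collapse rows with the same y (here every value of y happens to be distinct)
pmf_y = {}
for _, _, y, p in rows:
    pmf_y[y] = pmf_y.get(y, 0) + p

for y in sorted(pmf_y):
    print(y, pmf_y[y])
```

The collapsing step matters in general: when two different outcomes give the same value of \( Y \), their probabilities must be added, as in the dice example.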

    Example 6

    Total Processing Time and Cost
    A manufacturing process involves two stages:
    The time taken to complete the first stage, \( X_1 \) hours, is a continuous random variable with mean \( \mu_1 = 4 \) and standard deviation \( \sigma_1 = 1.5 \).
    The time taken to complete the second stage, \( X_2 \) hours, is a continuous random variable with mean \( \mu_2 = 7 \) and standard deviation \( \sigma_2 = 1 \).
    Assume that the second stage is able to commence immediately after the first stage ends.
    a. Find the mean and standard deviation of the total processing time, if the times taken at each stage are independent.
    b. If the cost of processing is $200 per hour for the first stage and $300 per hour for the second stage, find the mean and standard deviation of the total processing cost.

    Solution
    a. The total processing time is \( X_1 + X_2 \) hours.
    The mean of the total processing time is:
    \( E(X_1 + X_2) = E(X_1) + E(X_2) = 4 + 7 = 11 \) hours
    The variance of the total processing time is:
    \( \text{Var}(X_1 + X_2) = \text{Var}(X_1) + \text{Var}(X_2) = 1.5^2 + 1^2 = 3.25 \)
    Hence the standard deviation of the total processing time is:
    \( \text{sd}(X_1 + X_2) = \sqrt{3.25} \approx 1.803 \) hours

    Solution
    b. Let \( C \) be the total processing cost in dollars. Then \( C = 200X_1 + 300X_2 \).
    The mean of the total processing cost is:
    \( E(C) = E(200X_1 + 300X_2) \)
    \( = 200 E(X_1) + 300 E(X_2) \)
    \( = 200 \times 4 + 300 \times 7 \)
    \( = \$2900 \)
    The variance of the total processing cost is:
    \( \text{Var}(C) = \text{Var}(200X_1 + 300X_2) \)
    \( = 200^2 \text{Var}(X_1) + 300^2 \text{Var}(X_2) \)
    \( = 200^2 \times 1.5^2 + 300^2 \times 1^2 \)
    \( = 180\,000 \)
    Hence the standard deviation of the total processing cost is:
    \( \text{sd}(C) = \sqrt{180\,000} \approx \$424.26 \)
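
A simulation gives a rough sanity check of part b. The question only states that the stage times are continuous with the given means and standard deviations, so the sketch below makes an extra assumption purely for illustration: each stage time is drawn from a normal distribution. The formulas for the mean and standard deviation do not depend on that choice.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
n_trials = 100_000

costs = []
for _ in range(n_trials):
    # ASSUMPTION for illustration only: each stage time is normally distributed.
    x1 = rng.gauss(4, 1.5)  # stage 1 time in hours
    x2 = rng.gauss(7, 1.0)  # stage 2 time in hours, drawn independently
    costs.append(200 * x1 + 300 * x2)

sim_mean = statistics.fmean(costs)
sim_sd = statistics.pstdev(costs)

print(round(sim_mean), round(sim_sd))  # should land near 2900 and 424
```

The simulated values will differ slightly from the exact answers because of sampling variation.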

    Example 7

    Probability
    The time taken to prepare a house for painting is known to be normally distributed with a mean of 10 hours and a standard deviation of 4 hours.
    The time taken to paint the house is independent of the preparation time and is normally distributed with a mean of 20 hours and a standard deviation of 3 hours.
    What is the probability that the total time taken to prepare and paint the house is more than 35 hours?

    Solution
    Let \( X \) represent the time taken to prepare the house, and \( Y \) the time taken to paint the house. Since \( X \) and \( Y \) are independent normal random variables, the distribution of \( X + Y \) is also normal, with:
    \( E(X + Y) = E(X) + E(Y) = 10 + 20 = 30 \)
    \( \text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) = 4^2 + 3^2 = 25 \)
    \( \text{sd}(X + Y) = \sqrt{\text{Var}(X + Y)} = \sqrt{25} = 5 \)

    Therefore, the probability that the total time taken to prepare and paint the house is more than 35 hours is:
    \( Pr(X + Y > 35) = Pr\left(Z > \frac{35 - 30}{5}\right) = Pr(Z > 1) \approx 0.1587 \)
