An experiment often consists of repeated trials, each of which may be considered as having only two possible outcomes. For example, when a coin is tossed, the two possible outcomes are ‘head’ and ‘tail’. When a die is rolled, the two outcomes are determined by the event of interest for the experiment: if the event of interest is a ‘six’, then the two outcomes are ‘six’ and ‘not a six’.
A Bernoulli sequence is the name used to describe a sequence of repeated trials with the following properties:

- Each trial results in one of two outcomes, usually designated as a success, \( S \), or a failure, \( F \).
- The probability of success on a single trial, \( p \), is constant for all trials (and thus the probability of failure on a single trial is \( 1 - p \)).
- The trials are independent, so the outcome of any trial is not affected by the outcome of any previous trial.
The number of successes in a Bernoulli sequence of \( n \) trials is called a binomial random variable and is said to have a binomial probability distribution.
For example, consider rolling a fair six-sided die three times. Let the random variable \( X \) be the number of 3s observed.
Let \( T \) represent a 3, and let \( N \) represent not a 3. Each roll meets the conditions of a Bernoulli trial. Thus \( X \) is a binomial random variable.
Now consider all the possible outcomes from the three rolls and their probabilities.
Outcome | Number of 3s | Probability |
---|---|---|
TTT | X = 3 | \( \frac{1}{6} \times \frac{1}{6} \times \frac{1}{6} \) |
TTN | X = 2 | \( \frac{1}{6} \times \frac{1}{6} \times \frac{5}{6} \) |
TNT | X = 2 | \( \frac{1}{6} \times \frac{5}{6} \times \frac{1}{6} \) |
NTT | X = 2 | \( \frac{5}{6} \times \frac{1}{6} \times \frac{1}{6} \) |
TNN | X = 1 | \( \frac{1}{6} \times \frac{5}{6} \times \frac{5}{6} \) |
NTN | X = 1 | \( \frac{5}{6} \times \frac{1}{6} \times \frac{5}{6} \) |
NNT | X = 1 | \( \frac{5}{6} \times \frac{5}{6} \times \frac{1}{6} \) |
NNN | X = 0 | \( \frac{5}{6} \times \frac{5}{6} \times \frac{5}{6} \) |
Pr(X = 3) = \( \left(\frac{1}{6}\right)^3 \)
Pr(X = 2) = \( 3 \times \left(\frac{1}{6}\right)^2 \times \frac{5}{6} \)
Pr(X = 1) = \( 3 \times \frac{1}{6} \times \left(\frac{5}{6}\right)^2 \)
Pr(X = 0) = \( \left(\frac{5}{6}\right)^3 \)
Thus the probability distribution of \( X \) is given by the following table:
x | 0 | 1 | 2 | 3 |
---|---|---|---|---|
Pr(X = x) | \( \frac{125}{216} \) | \( \frac{75}{216} \) | \( \frac{15}{216} \) | \( \frac{1}{216} \) |
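This table can be checked by brute-force enumeration. Here is a minimal Python sketch (our own illustration, not part of the text) that runs through all \( 6^3 = 216 \) equally likely outcomes of the three rolls and tallies the exact probability of each value of \( X \):

```python
from itertools import product
from fractions import Fraction

# Accumulate the exact probability of each value of X (the number of 3s).
dist = {x: Fraction(0) for x in range(4)}
for rolls in product(range(1, 7), repeat=3):
    x = sum(1 for r in rolls if r == 3)   # number of 3s in this outcome
    dist[x] += Fraction(1, 216)           # each outcome has probability 1/216

for x, pr in dist.items():
    print(x, pr)
# Fraction reduces automatically, so this prints
# 125/216, 25/72 (= 75/216), 5/72 (= 15/216) and 1/216,
# in agreement with the table above.
```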
Consider the probability that \( X = 1 \), that is, when exactly one 3 is observed. We can see from the table that there are three ways this can occur. Since the 3 could occur on the first, second or third roll of the die, we can consider this as selecting one object from a group of three, which can be done in \( \binom{3}{1} = 3 \) ways.
Consider the probability that \( X = 2 \), that is, when exactly two 3s are observed. Again from the table, there are three ways this can occur. Since the two 3s could occur on any two of the three rolls of the die, we can consider this as selecting two objects from a group of three, which can be done in \( \binom{3}{2} = 3 \) ways.
This leads us to a general formula for this probability distribution: \[ \text{Pr}(X = x) = \binom{3}{x} \left(\frac{1}{6}\right)^x \left(\frac{5}{6}\right)^{3-x} \quad \text{for} \quad x = 0, 1, 2, 3 \]
This is an example of the binomial distribution.
If the random variable X is the number of successes in \( n \) independent trials, each with probability of success \( p \), then \( X \) has a binomial distribution, written \( X \sim \text{Bi}(n, p) \), and the rule is:
\[ \text{Pr}(X = x) = \binom{n}{x} p^x (1 - p)^{n-x} \quad \text{for} \quad x = 0, 1, \ldots, n \]where: \[ \binom{n}{x} = \frac{n!}{x!(n - x)!} \]
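This rule translates directly into code. As an illustration, the Python sketch below defines a helper `binomial_pmf` (the name is ours, not from the text) and reproduces the die-rolling example:

```python
from math import comb   # comb(n, x) = n! / (x! (n - x)!)

def binomial_pmf(x: int, n: int, p: float) -> float:
    # Pr(X = x) for X ~ Bi(n, p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# The die example: n = 3 trials, probability of success p = 1/6.
for x in range(4):
    print(x, binomial_pmf(x, 3, 1/6))
# 0  0.5787...  (= 125/216)
# 1  0.3472...  (= 75/216)
# 2  0.0694...  (= 15/216)
# 3  0.0046...  (= 1/216)
```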
The binomial distribution is thus a discrete probability distribution: it describes the number of successes in a fixed number of independent Bernoulli trials, each with the same probability of success \( p \).

Conditional probability, on the other hand, deals with the probability of an event occurring given that another event has already occurred. It is denoted by \( \text{Pr}(A \mid B) \), the probability of event \( A \) occurring given that event \( B \) has occurred.

In the context of the binomial distribution, conditional probability can be used to calculate the probability of certain outcomes given partial information about the trials. For example, given that at least one success has been observed in a series of Bernoulli trials, we can calculate the conditional probability that two or more successes occurred, as in the sketch below.
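The following Python sketch (our own illustration; the parameters \( n = 10 \) and \( p = 0.5 \) and the helper name `binomial_pmf` are assumptions, not from the text) computes \( \text{Pr}(X \geq 2 \mid X \geq 1) \) using the definition \( \text{Pr}(A \mid B) = \text{Pr}(A \cap B) / \text{Pr}(B) \):

```python
from math import comb

def binomial_pmf(x, n, p):
    # Pr(X = x) for X ~ Bi(n, p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.5                              # illustrative parameters
pr_ge_1 = 1 - binomial_pmf(0, n, p)         # Pr(X >= 1)
pr_ge_2 = pr_ge_1 - binomial_pmf(1, n, p)   # Pr(X >= 2)

# Pr(X >= 2 | X >= 1) = Pr(X >= 2 and X >= 1) / Pr(X >= 1)
#                     = Pr(X >= 2) / Pr(X >= 1), since {X >= 2} is a subset of {X >= 1}
print(pr_ge_2 / pr_ge_1)                    # approximately 0.9902
```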
A probability distribution may be represented as a rule, a table, or a graph. We now investigate the shape of the graph of a binomial probability distribution for different values of the parameters \( n \) and \( p \).
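One way to carry out this investigation is with a short plotting script. In the Python sketch below, the parameter choices are our own and purely illustrative: with \( n = 10 \), the graph is positively skewed for \( p = 0.2 \), symmetric for \( p = 0.5 \) and negatively skewed for \( p = 0.8 \).

```python
import matplotlib.pyplot as plt
from math import comb

def binomial_pmf(x, n, p):
    # Pr(X = x) for X ~ Bi(n, p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Draw the probability function for several values of the parameters.
fig, axes = plt.subplots(1, 3, figsize=(12, 3), sharey=True)
for ax, (n, p) in zip(axes, [(10, 0.2), (10, 0.5), (10, 0.8)]):
    xs = list(range(n + 1))
    ax.bar(xs, [binomial_pmf(x, n, p) for x in xs])
    ax.set_title(f"n = {n}, p = {p}")
    ax.set_xlabel("x")
axes[0].set_ylabel("Pr(X = x)")
plt.show()
```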
How many heads would you expect to obtain, on average, if a fair coin were tossed 10 times?
While the exact number of heads in the 10 tosses would vary, and could theoretically take values from 0 to 10, it seems reasonable that the long-run average number of heads would be 5. It turns out that this is correct. That is, for a binomial random variable \( X \) with \( n = 10 \) and \( p = 0.5 \),
\[ E(X) = \sum_{x=0}^{10} x \cdot \text{Pr}(X = x) = 5 \]
In general, the expected value of a binomial random variable is equal to the number of trials multiplied by the probability of success. The variance can also be calculated from the parameters \( n \) and \( p \).
If \( X \) is the number of successes in \( n \) trials, each with a probability of success \( p \), then the expected value and the variance of \( X \) are given by:
\[ E(X) = np \]
\[ \text{Var}(X) = np(1 - p) \]
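These two formulas can be checked numerically against the definitions of expected value and variance. A minimal Python sketch, using the coin-tossing example above (\( n = 10 \), \( p = 0.5 \)):

```python
from math import comb

def binomial_pmf(x, n, p):
    # Pr(X = x) for X ~ Bi(n, p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.5   # the coin-tossing example
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
var = sum((x - mean)**2 * binomial_pmf(x, n, p) for x in range(n + 1))

print(mean, n * p)            # both give 5.0
print(var, n * p * (1 - p))   # both give 2.5
```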
First, we check that for any binomial random variable the probabilities sum to 1.

Proof: The binomial theorem, discussed in Appendix A, states that
\[ (a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k \]
Now, using the binomial theorem, the sum of the probabilities for a binomial random variable \( X \) with parameters \( n \) and \( p \) is given by
\[ \sum_{x=0}^{n} \text{Pr}(X = x) = \sum_{x=0}^{n} \binom{n}{x} p^x (1 - p)^{n-x} \]
\[ = \big( (1 - p) + p \big)^n \quad \text{by the binomial theorem, with } a = 1 - p \text{ and } b = p \]
\[ = 1^n = 1 \]
If \( X \) is a binomial random variable with parameters \( n \) and \( p \), then \( E(X) = np \).
Proof: By the definition of expected value:
\[ E(X) = \sum_{x=0}^{n} x \cdot \binom{n}{x} p^x (1 - p)^{n-x} \] by the distribution formula
\[ = \sum_{x=0}^{n} x \cdot \frac{n!}{x! (n - x)!} p^x (1 - p)^{n-x} \] expanding \(\binom{n}{x}\)
\[ = \sum_{x=1}^{n} x \cdot \frac{n!}{x! (n - x)!} p^x (1 - p)^{n-x} \] since the \( x = 0 \) term is zero
\[ = \sum_{x=1}^{n} x \cdot \frac{n!}{x(x - 1)! (n - x)!} p^x (1 - p)^{n-x} \] since \( x! = x(x - 1)! \)
\[ = \sum_{x=1}^{n} \frac{n!}{(x - 1)! (n - x)!} p^x (1 - p)^{n-x} \] cancelling the \( x \)
This expression is very similar to the probability function for a binomial random variable, and we know the probabilities sum to 1. Taking out factors of \( n \) and \( p \) from the expression and letting \( z = x - 1 \) gives
\[ E(X) = np \sum_{x=1}^{n} \frac{(n - 1)!}{(x - 1)! (n - x)!} p^{x-1} (1 - p)^{n-x} = np \sum_{z=0}^{n-1} \frac{(n - 1)!}{z! (n - 1 - z)!} p^z (1 - p)^{n-1-z} \]
Note that this sum corresponds to the sum of all the values of the probability function for a binomial random variable \( Z \), which is the number of successes in \( n - 1 \) trials each with probability of success \( p \). Therefore the sum equals 1, and so
\[ E(X) = np \]
If \( X \) is a binomial random variable with parameters \( n \) and \( p \), then \( \text{Var}(X) = np(1 - p) \).
Proof
The variance of the binomial random variable \( X \) may be found using \( \text{Var}(X) = E(X^2) - \mu^2 \), where \( \mu = np \).
Thus, to find the variance, we need to determine \( E(X^2) \):
\[ E(X^2) = \sum_{x=0}^{n} x^2 \cdot \frac{n!}{x!(n-x)!} p^x (1 - p)^{n-x} \]
The previous proof relied on the fact that \( x \) is a factor of \( x! \). Since \( x^2 \) is not a factor of \( x! \), we cannot proceed in the same way here.
The strategy used here is to determine \( E[X(X - 1)] \):
\[ E[X(X - 1)] = \sum_{x=0}^{n} x(x - 1) \cdot \frac{n!}{x!(n-x)!} p^x (1 - p)^{n-x} \]
The \( x = 0 \) and \( x = 1 \) terms are zero, and since \( x! = x(x - 1)(x - 2)! \), we can cancel the factor \( x(x - 1) \) and take out factors of \( n(n - 1) \) and \( p^2 \):
\[ E[X(X - 1)] = n(n - 1)p^2 \sum_{x=2}^{n} \frac{(n - 2)!}{(x - 2)! (n - x)!} p^{x-2} (1 - p)^{n-x} \]
Substituting \( z = x - 2 \), the sum corresponds to the sum of all the values of the probability function for a binomial random variable \( Z \), which is the number of successes in \( n - 2 \) trials each with probability of success \( p \), and is thus equal to 1. Hence
\[ E[X(X - 1)] = n(n - 1)p^2 \]
Expanding gives \( E[X(X - 1)] = E(X^2) - E(X) \), and so
\[ E(X^2) - E(X) = n(n - 1)p^2 \]
Therefore,
\[ E(X^2) = n(n - 1)p^2 + E(X) \]
\[ = n(n - 1)p^2 + np \]
This is an expression for \( E(X^2) \) in terms of \( n \) and \( p \), as required. Thus
\[ \text{Var}(X) = E(X^2) - \mu^2 \]
\[ = n(n - 1)p^2 + np - (np)^2 \]
\[ = n^2p^2 - np^2 + np - n^2p^2 \]
\[ = np - np^2 \]
\[ = np(1 - p) \]
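As a quick check, we can apply these formulas to the die-rolling example from the start of the section, where \( X \) was the number of 3s observed in \( n = 3 \) rolls with \( p = \frac{1}{6} \):
\[ E(X) = np = 3 \times \frac{1}{6} = \frac{1}{2} \qquad \text{and} \qquad \text{Var}(X) = np(1 - p) = 3 \times \frac{1}{6} \times \frac{5}{6} = \frac{5}{12} \]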