In Probability and Statistics, the Cumulative Distribution Function (CDF) of a real-valued random variable, say “X”, evaluated at x, is the probability that X takes a value less than or equal to x. A random variable is a variable whose possible values are the outcomes of a random phenomenon. The CDF is defined for both discrete and continuous random variables. It is also used to specify the distribution of multivariate random variables. The probability that the random variable is above a particular level is known as the tail distribution or the Complementary Cumulative Distribution Function (CCDF). In this article, you will understand what the cumulative distribution function is, its properties, formulas, applications, and examples.
The Cumulative Distribution Function (CDF) of a real-valued random variable X, evaluated at x, is the probability that X will take a value less than or equal to x. It is used to describe the probability distribution of random variables in a table. With the help of this data, we can easily create a CDF plot in an Excel sheet.
In other words, the CDF gives the cumulative probability for a given value. It is used to determine the probability that a random variable falls at or below a value, and to compare probabilities between values under certain conditions. For discrete distribution functions, the CDF adds up the probabilities of all values up to the specified value, and for continuous distribution functions, it gives the area under the probability density function up to the specified value.
The CDF defined for a discrete random variable is given as:
Fx(x) = P(X ≤ x)
Here, Fx(x) is the probability that X takes a value less than or equal to x. Now consider the case where X lies in the semi-closed interval (a, b], where a < b.
Therefore, the probability within the interval is written as:
P(a < X ≤ b) = Fx(b) – Fx(a)
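The interval formula above can be tried out on a small discrete case. The following is a minimal sketch, assuming a made-up random variable (the number of heads in three fair coin tosses); the names PMF, cdf and interval_probability are chosen here for illustration and are not part of the article.

```python
from fractions import Fraction

# Hypothetical example: X = number of heads in three fair coin tosses.
PMF = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def cdf(x):
    """Fx(x) = P(X <= x): add up the probabilities of all values <= x."""
    return sum((p for value, p in PMF.items() if value <= x), Fraction(0))

def interval_probability(a, b):
    """P(a < X <= b) = Fx(b) - Fx(a), for a < b."""
    if not a < b:
        raise ValueError("expected a < b")
    return cdf(b) - cdf(a)

# P(0 < X <= 2) covers the outcomes 1 and 2.
print(interval_probability(0, 2))  # 3/4
```

Exact fractions are used so that the difference Fx(b) – Fx(a) is not blurred by floating-point rounding.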
The CDF defined for a continuous random variable is given as:
Fx(x) = ∫_{−∞}^{x} fx(t) dt
Here, the CDF of X is expressed as the integral of its probability density function fx, taken from −∞ up to x.
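To see the integral definition at work, the sketch below integrates a density numerically and compares the result with a known closed form. It assumes an exponential distribution with rate 2, which is not taken from the article, and uses a simple midpoint rule; the function names are illustrative only.

```python
import math

RATE = 2.0  # assumed rate parameter of the exponential distribution

def pdf(t):
    """Density fx(t) = RATE * exp(-RATE * t) for t >= 0, and 0 otherwise."""
    return RATE * math.exp(-RATE * t) if t >= 0 else 0.0

def cdf_numeric(x, steps=100_000):
    """Approximate Fx(x) by a midpoint-rule integral of the density.

    The density is zero below 0, so integrating from 0 to x gives the
    same result as integrating from minus infinity to x.
    """
    if x <= 0:
        return 0.0
    width = x / steps
    return sum(pdf((i + 0.5) * width) for i in range(steps)) * width

def cdf_exact(x):
    """Closed form Fx(x) = 1 - exp(-RATE * x) for x >= 0."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

for x in (0.5, 1.0, 2.0):
    print(x, cdf_numeric(x), cdf_exact(x))
```

The two columns should agree to many decimal places, which is what the integral definition predicts.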
In case the distribution of the random variable X has a discrete component at the value b, the probability of that single value equals the size of the jump of the CDF at b:
P(X = b) = Fx(b) – lim_{x→b⁻} Fx(x)
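The jump formula can be illustrated with a distribution that has both a discrete and a continuous part. The sketch below assumes a made-up mixed distribution (a point mass of 0.3 at 0, and otherwise uniform on (0, 1)) and approximates the left-hand limit by evaluating the CDF slightly below b; both choices are assumptions for illustration.

```python
def cdf(x):
    """CDF of an assumed mixed distribution.

    With probability 0.3 the variable equals exactly 0 (a point mass);
    otherwise it is uniform on the interval (0, 1).
    """
    if x < 0:
        return 0.0
    if x < 1:
        return 0.3 + 0.7 * x
    return 1.0

def point_probability(b, eps=1e-9):
    """Approximate P(X = b) = Fx(b) - lim_{x -> b-} Fx(x).

    The left-hand limit is approximated by evaluating just below b.
    """
    return cdf(b) - cdf(b - eps)

print(point_probability(0.0))  # jump of about 0.3 at the point mass
print(point_probability(0.5))  # about 0: no discrete component here
```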
The cumulative distribution function Fx(x) of a random variable has the following important properties:
For a continuous random variable X, the density function fx is equal to the derivative of Fx, such that for all real numbers a and b with a < b:
Fx(b) – Fx(a) = P(a < X ≤ b) = ∫_{a}^{b} fx(x) dx
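The link between the density and the CDF described above can be checked numerically. The sketch below assumes a standard normal distribution (not mentioned in the article), writes its CDF with the error function math.erf, and compares a finite-difference slope of the CDF with the density. The helper names are illustrative.

```python
import math

def pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cdf_slope(x, h=1e-5):
    """Central-difference estimate of the derivative of the CDF at x."""
    return (cdf(x + h) - cdf(x - h)) / (2.0 * h)

def integrate_pdf(a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of the density over [a, b]."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

a, b = -1.0, 1.0
print(cdf(b) - cdf(a))           # P(a < X <= b) from the CDF
print(integrate_pdf(a, b))       # the same probability from the density
print(cdf_slope(0.3), pdf(0.3))  # slope of the CDF vs. the density
```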
If X is a purely discrete random variable, it takes the values x1, x2, x3,… with probability pi = p(xi), and the CDF of X will be discontinuous at the points xi:
Fx(x) = P(X ≤ x) = Σ_{xi ≤ x} P(X = xi) = Σ_{xi ≤ x} p(xi)
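The step-like shape of a discrete CDF can be made concrete with running totals. The sketch below assumes three hypothetical values with made-up probabilities and uses the standard-library helpers itertools.accumulate and bisect.bisect_right; it is one possible way to evaluate such a CDF, not the only one.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate

# Hypothetical values x1 < x2 < x3 with probabilities p1, p2, p3.
VALUES = [1, 2, 5]
PROBS = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

# Running totals p1, p1 + p2, p1 + p2 + p3.
CUMULATIVE = list(accumulate(PROBS))

def cdf(x):
    """Fx(x) = sum of p(xi) over all xi <= x."""
    count = bisect_right(VALUES, x)  # how many xi are <= x
    return CUMULATIVE[count - 1] if count else Fraction(0)

# The CDF is flat between the xi and jumps by p(xi) at each xi.
for x in (0.9, 1, 1.9, 2, 4.9, 5):
    print(x, cdf(x))
```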
This function is defined for all real values; sometimes it is defined implicitly rather than explicitly. For a continuous random variable, the CDF is the integral of the PDF (Probability Density Function).
Consider a simple example of CDF, which is given by rolling a fair six-sided die, where X is the random variable.
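We know that each outcome 1, 2, …, 6 of rolling a fair six-sided die has probability 1/6, so the cumulative probabilities are given as:
P(X ≤ 1) = 1/6
P(X ≤ 2) = 2/6
P(X ≤ 3) = 3/6
P(X ≤ 4) = 4/6
P(X ≤ 5) = 5/6
P(X ≤ 6) = 6/6 = 1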
From this, it is noted that the value of the CDF always lies between 0 and 1, and that the CDF is non-decreasing and right-continuous in nature.
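The die example can be reproduced in a few lines. This is a minimal sketch using exact fractions; the names FACES, P_FACE and die_cdf are chosen here for illustration.

```python
from fractions import Fraction

FACES = range(1, 7)
P_FACE = Fraction(1, 6)  # each face of a fair die is equally likely

def die_cdf(x):
    """P(X <= x) for a fair six-sided die."""
    return sum((P_FACE for face in FACES if face <= x), Fraction(0))

cdf_values = [die_cdf(k) for k in FACES]

for k, value in zip(FACES, cdf_values):
    print(f"P(X <= {k}) = {value}")

# Check the properties named above on this example.
assert all(0 <= v <= 1 for v in cdf_values)                     # stays within [0, 1]
assert all(u <= v for u, v in zip(cdf_values, cdf_values[1:]))  # non-decreasing
assert die_cdf(3) == die_cdf(3.000001)                          # value at a jump matches values just to its right
```

The printed fractions appear in lowest terms, so 2/6 is shown as 1/3 and 6/6 as 1.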
The most important application of the cumulative distribution function is in statistical analysis. In statistical analysis, the concept of the CDF is used in two ways:
1. Finding the frequency of occurrence of values for a given phenomenon using cumulative frequency analysis.
2. Deriving some simple statistical properties by using an empirical distribution function, which provides a formal direct estimate of CDFs.
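Both uses listed above rest on counting how many observations fall at or below a value. The sketch below builds an empirical distribution function from a small sample; the data are made up for illustration, and make_ecdf is a hypothetical helper name rather than a function from any particular library.

```python
from bisect import bisect_right

def make_ecdf(sample):
    """Return the empirical distribution function of a sample.

    F_n(x) is the fraction of observations that are <= x.
    """
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def ecdf(x):
        # bisect_right counts the observations that are <= x,
        # which is the cumulative frequency at x.
        return bisect_right(ordered, x) / n

    return ecdf

# Made-up observations, for illustration only.
data = [3.1, 1.4, 2.7, 1.4, 5.0, 4.2, 2.7, 2.7]
ecdf = make_ecdf(data)
print(ecdf(1.4))  # 2 of 8 observations are <= 1.4
print(ecdf(2.7))  # 5 of 8 observations are <= 2.7
print(ecdf(10))   # every observation is <= 10
```

Sorting once and then using a binary search keeps each evaluation fast even for large samples.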