The rule applied for finding the derivative of a composite function (e.g., \( \cos 2x \), \( \log 2x \), etc.) is known as the chain rule. It is also called the composite function rule. The chain rule is applicable only to composite functions. So, before starting with the formula of the chain rule, let us understand the meaning of a composite function and how it can be differentiated.
An expression such as \( q(x) = (x^3 + 1)^2 \) may be differentiated by expanding and then differentiating each term separately. This method is much more tedious for an expression such as \( q(x) = (x^3 + 1)^{30} \).
We can express \( q(x) = (x^3 + 1)^2 \) as the composition of two simpler functions defined by \( u = g(x) = x^3 + 1 \) and \( y = f(u) = u^2 \), which are ‘chained’ together:
x → g → u → f → y
That is, \( q(x) = (x^3 + 1)^2 = f(g(x)) \), and so \( q \) is expressed as the composition \( f \circ g \).
The chain rule gives a method of differentiating such functions.
If g is differentiable at x and f is differentiable at g(x), then the composite function q(x) = f(g(x)) is differentiable at x and
q'(x) = f'(g(x)) · g'(x)
Or using Leibniz notation, where u = g(x) and y = f(u),
dy/dx = (dy/du) × (du/dx)
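As a brief illustration (a worked check rather than part of the statement of the rule), apply the chain rule to \( q(x) = (x^3 + 1)^2 \) with \( u = g(x) = x^3 + 1 \) and \( y = f(u) = u^2 \):
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = 2u \cdot 3x^2 = 6x^2 (x^3 + 1) \]
Expanding first gives \( q(x) = x^6 + 2x^3 + 1 \), so \( q'(x) = 6x^5 + 6x^2 = 6x^2 (x^3 + 1) \), which agrees. The same steps applied to \( q(x) = (x^3 + 1)^{30} \) give \( q'(x) = 30 (x^3 + 1)^{29} \cdot 3x^2 = 90x^2 (x^3 + 1)^{29} \) with no expansion needed.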
To find the derivative of \( q = f \circ g \) at \( x = a \), consider the secant through the points \( (a, (f \circ g)(a)) \) and \( (a + h, (f \circ g)(a + h)) \). The gradient of this secant is:
\[ \frac{(f \circ g)(a + h) - (f \circ g)(a)}{h} \]
We carry out the trick of multiplying the numerator and the denominator by \( g(a + h) - g(a) \). This gives:
\[ \frac{f(g(a + h)) - f(g(a))}{h} \times \frac{g(a + h) - g(a)}{g(a + h) - g(a)} \]
provided \( g(a + h) - g(a) \neq 0 \).
Now write \( b = g(a) \) and \( b + k = g(a + h) \) so that \( k = g(a + h) - g(a) \). The expression for the gradient becomes:
\[ \frac{f(b + k) - f(b)}{k} \times \frac{g(a + h) - g(a)}{h} \]
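The function \( g \) is continuous at \( a \) since it is differentiable there, and therefore:
\[ \lim_{h \to 0} k = \lim_{h \to 0} \left( g(a + h) - g(a) \right) = 0 \]
Thus, as \( h \) approaches 0, so does \( k \). The first factor therefore tends to \( f'(b) = f'(g(a)) \) and the second factor tends to \( g'(a) \). Hence: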
\[ q'(a) = f'(g(a)) \cdot g'(a) \]
Note that this proof does not hold for a function \( g \) such that \( g(a + h) - g(a) = 0 \) for arbitrarily small \( h \).
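The limiting process in the proof can be observed numerically. The following is a minimal sketch, assuming the example \( q(x) = (x^3 + 1)^{30} \) from earlier and an arbitrarily chosen point \( a = 0.5 \); the function names are illustrative only. It compares the gradient of the secant with the value given by the chain rule as \( h \) shrinks.

```python
def g(x):
    # Inner function u = g(x) = x^3 + 1
    return x**3 + 1


def f(u):
    # Outer function y = f(u) = u^30
    return u**30


def q(x):
    # Composite function q = f o g
    return f(g(x))


def q_prime_chain(x):
    # Chain rule: f'(g(x)) * g'(x), with f'(u) = 30 u^29 and g'(x) = 3 x^2
    return 30 * g(x)**29 * 3 * x**2


def secant_gradient(func, a, h):
    # Gradient of the secant through (a, func(a)) and (a + h, func(a + h))
    return (func(a + h) - func(a)) / h


a = 0.5  # assumed sample point
for h in (1e-2, 1e-4, 1e-6):
    print(f"h = {h:g}: secant = {secant_gradient(q, a, h):.6f}, "
          f"chain rule = {q_prime_chain(a):.6f}")
```

The secant gradients should approach the chain-rule value as \( h \) decreases, up to floating-point rounding.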
Before using the chain rule to differentiate rational powers, we will show how to differentiate \( x^{\frac{1}{2}} \) and \( x^{\frac{1}{3}} \) by first principles.
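A sketch of these two computations, assuming \( x > 0 \) and using the limit definition of the derivative, might run as follows. For \( f(x) = x^{\frac{1}{2}} \), multiply the numerator and denominator by \( \sqrt{x + h} + \sqrt{x} \):
\[ f'(x) = \lim_{h \to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} = \lim_{h \to 0} \frac{(x + h) - x}{h \left( \sqrt{x + h} + \sqrt{x} \right)} = \lim_{h \to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} = \frac{1}{2\sqrt{x}} = \frac{1}{2} x^{-\frac{1}{2}} \]
For \( f(x) = x^{\frac{1}{3}} \), write \( a = (x + h)^{\frac{1}{3}} \) and \( b = x^{\frac{1}{3}} \), so that \( h = a^3 - b^3 = (a - b)(a^2 + ab + b^2) \). Then:
\[ f'(x) = \lim_{h \to 0} \frac{a - b}{h} = \lim_{h \to 0} \frac{1}{a^2 + ab + b^2} = \frac{1}{3 x^{\frac{2}{3}}} = \frac{1}{3} x^{-\frac{2}{3}} \]
since \( a \to b \) as \( h \to 0 \).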
Note: We can prove that \( a^n - b^n = (a - b)(a^{n-1} + a^{n-2}b + a^{n-3}b^2 + \cdots + ab^{n-2} + b^{n-1}) \) for \( n \geq 2 \). We could use this result to find the derivative of \( x^{\frac{1}{n}} \) by first principles, but instead we will use the chain rule.
If y is a one-to-one function of x, then using the chain rule in the form \( \frac{dy}{du} = \frac{dy}{dx} \cdot \frac{dx}{du} \) with \( y = u \), we have:
\[ 1 = \frac{dy}{dx} \cdot \frac{dx}{dy} \quad \Rightarrow \quad \frac{dy}{dx} = \frac{1}{\frac{dx}{dy}} \]
This holds provided \( \frac{dx}{dy} \neq 0 \).
Now let \( y = x^{\frac{1}{n}} \), where \( n \in \mathbb{Z} \setminus \{0\} \) and \( x > 0 \). We have \( y^n = x \) and so \( \frac{dx}{dy} = ny^{n-1} \). Therefore:
\[ \frac{dy}{dx} = \frac{1}{\frac{dx}{dy}} = \frac{1}{n y^{n-1}} = \frac{1}{n} \left( x^{\frac{1}{n}} \right)^{1-n} = \frac{1}{n} x^{\frac{1}{n} - 1} \]
For \( y = x^{\frac{1}{n}} \), \( \frac{dy}{dx} = \frac{1}{n} x^{\frac{1}{n} - 1} \), where \( n \in \mathbb{Z} \setminus \{0\} \) and \( x > 0 \).
This result may now be extended to rational powers. Let \( y = x^{\frac{p}{q}} \), where \( p, q \in \mathbb{Z} \setminus \{0\} \) and \( x > 0 \). Write \( y = \left( x^{\frac{1}{q}} \right)^p \). Let \( u = x^{\frac{1}{q}} \). Then \( y = u^p \). The chain rule yields:
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = p u^{p-1} \cdot \frac{1}{q} x^{\frac{1}{q} - 1} = \frac{p}{q} \left( x^{\frac{1}{q}} \right)^{p-1} \cdot x^{\frac{1}{q} - 1} = \frac{p}{q} x^{\frac{p}{q} - 1} \]
Thus, the result for integer powers has been extended to rational powers. In fact, the analogous result holds for any non-zero real power:
For \( f(x) = x^a \), \( f'(x) = ax^{a-1} \), where \( a \in \mathbb{R} \setminus \{0\} \) and \( x > 0 \).
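The power rule for rational exponents can be checked numerically. The sketch below is an illustration under assumptions: the sample point \( x = 2 \), the step size, the exponents chosen, and the helper names are not from the text. It compares a central-difference estimate of the derivative with \( \frac{p}{q} x^{\frac{p}{q} - 1} \).

```python
def power_rule(a, x):
    # Derivative of x^a given by the rule a * x^(a - 1), for x > 0
    return a * x**(a - 1)


def central_difference(func, x, h=1e-6):
    # Symmetric numerical estimate of func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)


x = 2.0  # assumed sample point, x > 0
for p, q in ((1, 2), (1, 3), (2, 3), (-3, 4)):
    a = p / q
    numeric = central_difference(lambda t: t**a, x)
    print(f"x^({p}/{q}): numeric = {numeric:.6f}, rule = {power_rule(a, x):.6f}")
```

The two columns should agree to several decimal places for each exponent.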
In this section, we investigate the derivative of functions of the form \( f(x) = a^x \). We will see that Euler’s number \( e \) has the special property that \( f'(x) = f(x) \) where \( f(x) = e^x \).
First, consider \( f: \mathbb{R} \to \mathbb{R} \), \( f(x) = 2^x \).
To find the derivative of \( f \), we recall that:
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{2^{x+h} - 2^x}{h} = 2^x \lim_{h \to 0} \frac{2^h - 1}{h} = 2^x \cdot f'(0) \]
We can investigate this limit numerically to find that \( f'(0) \approx 0.6931 \) and therefore:
\[ f'(x) \approx 0.6931 \times 2^x \]
Now consider \( g: \mathbb{R} \to \mathbb{R} \), \( g(x) = 3^x \). Then, as for \( f \), it may be shown that:
\[ g'(x) = 3^x \cdot g'(0) \]
We find \( g'(0) \approx 1.0986 \) and hence:
\[ g'(x) \approx 1.0986 \times 3^x \]
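The factor multiplying the function is less than 1 for base 2 and greater than 1 for base 3, which suggests that there is a base between 2 and 3 for which it is exactly 1. This base is Euler's number \( e \approx 2.718 \):
For \( f(x) = e^x \), \( f'(x) = e^x \).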
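The limits above can be investigated numerically as follows. This is a minimal sketch in which the helper name and the step sizes are assumptions: it evaluates \( \frac{a^h - 1}{h} \) for decreasing \( h \) with bases 2, 3 and \( e \).

```python
import math


def slope_at_zero(base, h):
    # Gradient of the secant of base**x through x = 0 and x = h
    return (base**h - 1) / h


for h in (1e-1, 1e-3, 1e-5, 1e-7):
    print(f"h = {h:g}: "
          f"base 2 -> {slope_at_zero(2, h):.4f}, "
          f"base 3 -> {slope_at_zero(3, h):.4f}, "
          f"base e -> {slope_at_zero(math.e, h):.4f}")
```

For small \( h \) the values should settle near 0.6931 for base 2, near 1.0986 for base 3, and near 1 for base \( e \).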
Next, consider \( y = e^{kx} \) where \( k \in \mathbb{R} \). The chain rule can be used to find the derivative:
Let \( u = kx \). Then \( y = e^u \). The chain rule yields:
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = e^u \cdot k = k e^{kx} \]
For \( f(x) = e^{kx} \), \( f'(x) = k e^{kx} \), where \( k \in \mathbb{R} \).
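As a worked illustration connecting this result to the earlier numerical estimate (a sketch that assumes the identity \( 2 = e^{\log_e 2} \), which has not been stated above), write \( 2^x = e^{(\log_e 2) x} \) and apply the result with \( k = \log_e 2 \):
\[ \frac{d}{dx} \left( 2^x \right) = \frac{d}{dx} \left( e^{(\log_e 2) x} \right) = (\log_e 2) \, e^{(\log_e 2) x} = (\log_e 2) \cdot 2^x \]
Since \( \log_e 2 \approx 0.6931 \), this agrees with the estimate \( f'(x) \approx 0.6931 \times 2^x \) found numerically.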
For the function with the rule \( f(x) = e^x \), we have seen that \( f'(x) = e^x \). This will be used to find the derivative of \( g: \mathbb{R}^+ \to \mathbb{R} \), \( g(x) = \log_e(kx) \) where \( k > 0 \).
Let \( y = \log_e(kx) \) and solve for \( x \):
\[ e^y = kx \quad \therefore \quad x = \frac{1}{k} e^y \]
Since the derivative of \( e^y \) with respect to \( y \) is \( e^y \):
\[ \frac{dx}{dy} = \frac{1}{k} e^y \]
Since \( e^y = kx \), this gives:
\[ \frac{dx}{dy} = \frac{kx}{k} = x \]
Thus, using \( \frac{dy}{dx} = \frac{1}{\frac{dx}{dy}} \):
\[ \frac{dy}{dx} = \frac{1}{x} \]
Let \( f: \mathbb{R}^+ \to \mathbb{R} \), \( f(x) = \log_e(kx) \) where \( k > 0 \). Then \( f': \mathbb{R}^+ \to \mathbb{R} \), \( f'(x) = \frac{1}{x} \).
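That the derivative does not depend on \( k \) can be checked numerically. The following is a small sketch with assumed values of \( k \), an assumed sample point \( x = 2.5 \), and illustrative helper names; it estimates the derivative of \( \log_e(kx) \) by a central difference and compares it with \( \frac{1}{x} \).

```python
import math


def log_kx(k, x):
    # f(x) = log_e(kx), defined for k > 0 and x > 0
    return math.log(k * x)


def central_difference(func, x, h=1e-6):
    # Symmetric numerical estimate of func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)


x = 2.5  # assumed sample point, x > 0
for k in (0.5, 1.0, 3.0, 10.0):
    slope = central_difference(lambda t: log_kx(k, t), x)
    print(f"k = {k}: numeric = {slope:.6f}, 1/x = {1 / x:.6f}")
```

Each value of \( k \) should give approximately the same slope, consistent with \( \log_e(kx) = \log_e k + \log_e x \), where the constant term \( \log_e k \) contributes nothing to the derivative.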