## Taylor Series: Mathematical Background

### Definitions

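Let $f:\mathbb{R}\rightarrow\mathbb{R}$ be a smooth (differentiable) function, and let $a\in\mathbb{R}$. Then the Taylor series of the function $f$ around the point $a$ is given by:

$$f(x)=f(a)+f'(a)(x-a)+\frac{f''(a)}{2!}(x-a)^2+\cdots+\frac{f^{(n)}(a)}{n!}(x-a)^n+\cdots$$

In particular, if $a=0$, then the expansion is known as the Maclaurin series and is given by:

$$f(x)=f(0)+f'(0)x+\frac{f''(0)}{2!}x^2+\cdots+\frac{f^{(n)}(0)}{n!}x^n+\cdots$$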
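As a minimal sketch of the definition, the partial sums of a Taylor series can be compared with the exact function value. The choice of $f(x)=e^x$ (whose derivatives at $a$ all equal $e^a$) and the helper name `taylor_exp` are assumptions made for illustration only.

```python
import math

def taylor_exp(x, a, n):
    """Partial sum of the Taylor series of exp around a, up to order n."""
    # Every derivative of exp evaluated at a equals exp(a).
    return sum(math.exp(a) * (x - a)**k / math.factorial(k) for k in range(n + 1))

orders = (1, 2, 4, 8)
exact = math.exp(1.0)
# Maclaurin case (a = 0), evaluated at x = 1
errors = [abs(exact - taylor_exp(1.0, 0.0, n)) for n in orders]
for n, err in zip(orders, errors):
    print(n, err)
```

The printed errors shrink as more terms are included, which is the behaviour the series form suggests.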

### Taylor’s Theorem

Many of the numerical analysis methods rely on Taylor’s theorem. In this section, a few mathematical facts are presented which serve as the basis for Taylor’s theorem. The ideas within the proofs presented here are attributed to Paul’s online calculus notes.

#### Extreme Values of Smooth Functions

##### Definition: Local Maximum and Local Minimum

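Let $f:[a,b]\rightarrow\mathbb{R}$. $f$ is said to have a local maximum at a point $c\in[a,b]$ if there exists an open interval $I\subset[a,b]$ such that $c\in I$ and $f(c)\geq f(x)\ \forall x\in I$. On the other hand, $f$ is said to have a local minimum at a point $c\in[a,b]$ if there exists an open interval $I\subset[a,b]$ such that $c\in I$ and $f(c)\leq f(x)\ \forall x\in I$. If $f$ has either a local maximum or a local minimum at $c$, then $f$ is said to have a local extremum at $c$.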

##### Proposition 1

Let $f:[a,b]\rightarrow\mathbb{R}$ be smooth (differentiable). Assume that $f$ has a local extremum (maximum or minimum) at a point $c\in[a,b]$. Then $f'(c)=0$. This proposition is also referred to in some texts as Fermat’s theorem.

View Proof of Proposition 1

Before presenting the rigorous proof, here is the intuitive idea on which it is based. Assume that $c$ is the location of a local maximum, i.e., there is an interval $I$ such that $f(c)$ is bigger than or equal to $f(x)$, where $x$ could be any point in $I$. If $x$ is to the left of $c$, we expect the slope of the line connecting $(x,f(x))$ and $(c,f(c))$ to be positive or zero, and we know that $f'(c)$ is the limit of the slope of this line as $x$ approaches $c$ from the left. Similarly, if $x$ is to the right of $c$, the slope of the line connecting $(c,f(c))$ and $(x,f(x))$ is negative or zero, and again $f'(c)$ is the limit of the slope of this line as $x$ approaches $c$ from the right. Since the limits from the left and from the right have to be equal, $f'(c)$, which is the limit of a sequence of non-negative numbers and of a sequence of non-positive numbers, has to be equal to zero. A similar argument applies if $c$ is a local minimum.

We will now write this in rigorous terms. Let $c$ be the location of a local maximum of the differentiable function $f$. Therefore, there exists an open interval $I$ with $c\in I$ such that $f(x)\leq f(c)\ \forall x\in I$. Therefore, for a sufficiently small $h$ we have:

$$f(c+h)-f(c)\leq 0$$

If we restrict ourselves to positive $h$ then we have:

$$\frac{f(c+h)-f(c)}{h}\leq 0$$

By definition, the limit from the right is

$$f'(c)=\lim_{h\to 0^+}\frac{f(c+h)-f(c)}{h}\leq 0$$

If we now restrict ourselves to negative $h$ then we have:

$$\frac{f(c+h)-f(c)}{h}\geq 0$$

By definition, the limit from the left is

$$f'(c)=\lim_{h\to 0^-}\frac{f(c+h)-f(c)}{h}\geq 0$$

The basic assumption for the theorem is that the limit at $c$ exists. Since that implies that the limit from the left is equal to the limit from the right, therefore:

$$f'(c)=0$$

Similar arguments apply if we assume $c$ to be the location of a local minimum.

This proposition simply means that if a smooth function attains a local maximum or minimum at a particular point, then the slope of the function is equal to zero at this point.
As an example, consider the function $f:[-3,3]\rightarrow\mathbb{R}$ with the relationship $f(x)=x^3-3x$. In this case, $f(-1)=2$ is a local maximum value for $f$ attained at $x=-1$ and $f(1)=-2$ is a local minimum value of $f$ attained at $x=1$. These local extrema values are associated with a zero slope for the function since

$$f'(x)=3x^2-3$$

$x=-1$ and $x=1$ are locations of local extrema and for both we have $f'(-1)=f'(1)=0$. The red lines in the next figure show the slope of the function at the extremum values.

View Mathematica Code that Generated the Above Figure
Clear[x]
y = x^3 - 3 x;
Plot[y, {x, -3, 3}, Epilog -> {PointSize[0.04], Point[{-1, 2}], Point[{1, -2}], Red, Line[{{-3, 2}, {1.5, 2}}], Line[{{3, -2}, {-1.5, -2}}]}, Filling -> Axis, PlotRange -> All, Frame -> True, AxesLabel -> {"x", "y"}]

View Python Code
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(-3,3,0.01)
y = x**3 - 3*x
plt.plot(x,y)
plt.fill_between(x, y, 0, alpha=0.20)
plt.plot([-3,1.5],[2,2],'r')
plt.plot([3,-1.5],[-2,-2],'r')
plt.plot([-1,1],[2,-2],'ko')
plt.xlabel('x'); plt.ylabel('y')
plt.grid(); plt.show()


“Smoothness” or “Differentiability” is a very important requirement for the proposition to work. As an example, consider the function $f:[-1,1]\rightarrow\mathbb{R}$ defined as $f(x)=|x|$. The function has a local minimum at $x=0$; however, $f'(0)$ is not defined, since the slope as $x\to 0$ from the right ($+1$) is different from the slope as $x\to 0$ from the left ($-1$), as shown in the next figure.

View Mathematica Code that Generated the Above Figure
Clear[x]
y = Abs[x];
Plot[y, {x, -1, 1}, Epilog -> {PointSize[0.04], Point[{0, 0}]}, PlotRange -> All, Frame -> True, AxesLabel -> {"x", "y"}]

View Python Code
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(-1,1,0.01)
y = abs(x)
plt.plot(x,y)
plt.plot([0],[0],'ko')
plt.xlabel('x'); plt.ylabel('y')
plt.grid(); plt.show()
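The role of differentiability can also be checked numerically. The sketch below compares the one-sided difference quotients of $x^3-3x$ at its local maximum $x=-1$ with those of $|x|$ at its local minimum $x=0$. The helper names `one_sided_slopes` and `cubic` are assumptions introduced for illustration.

```python
def one_sided_slopes(f, c, h):
    """Return (left, right) difference quotients of f at c for a step h > 0."""
    right = (f(c + h) - f(c)) / h      # slope of the secant to the right of c
    left = (f(c - h) - f(c)) / (-h)    # slope of the secant to the left of c
    return left, right

def cubic(x):
    return x**3 - 3*x

for h in (0.1, 0.01, 0.001):
    # cubic at x = -1: left slope >= 0, right slope <= 0, both shrink towards 0
    # abs at x = 0: the slopes stay at -1 and +1, so no derivative exists there
    print(h, one_sided_slopes(cubic, -1.0, h), one_sided_slopes(abs, 0.0, h))
```

For the cubic, the two one-sided slopes have opposite signs and approach a common limit of zero, as in the proof of Proposition 1. For the absolute value, they never approach each other.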


#### Extreme Value Theorem

Statement: Let $f:[a,b]\rightarrow\mathbb{R}$ be continuous. Then, $f$ attains its maximum and its minimum value at some points $c$ and $d$ in the interval $[a,b]$.

The theorem simply states that if we have a continuous function $f$ on a closed interval $[a,b]$, then the image of $f$ contains a maximum value and a minimum value, each attained within the interval $[a,b]$. The theorem is very intuitive. However, the proof is highly technical and relies on fundamental concepts in real analysis, including the definition of the real numbers and of continuous functions. You can review the Wikipedia entry or a course on real analysis such as this one for details of the proof. For now, we will just illustrate the meaning of the theorem using an example. Consider the function $f:[-1.5,1.5]\rightarrow\mathbb{R}$ defined as:

$$f(x)=x^3-3x$$

The theorem states that $f$ has to attain a maximum value and a minimum value at points within the interval. In this case, $f(-1)=2$ is the maximum value of $f$ attained at $x=-1$ and $f(1)=-2$ is the minimum value of $f$ attained at $x=1$. Alternatively, if $f:[-3,3]\rightarrow\mathbb{R}$ with the same relationship as above, $f(-3)=-18$ is the minimum value of $f$ attained at $x=-3$ and $f(3)=18$ is the maximum value of $f$ attained at $x=3$.
The following figure shows the graph of the function on the specified intervals.

View Mathematica Code that Generated the Above Figures
Clear[x]
y = x^3 - 3 x;
Plot[y, {x, -1.5, 1.5}, Epilog -> {PointSize[0.04], Point[{-1, 2}], Point[{1, -2}]}, Filling -> Axis, PlotRange -> All, Frame -> True, AxesLabel -> {"x", "y"}]
Plot[y, {x, -3, 3}, Epilog -> {PointSize[0.04], Point[{-3, y /. x -> -3}], Point[{3, y /. x -> 3}]}, Filling -> Axis, PlotRange -> All, Frame -> True, AxesLabel -> {"x", "y"}]

View Python Code
import numpy as np
import matplotlib.pyplot as plt
x1 = np.arange(-1.5,1.5,0.01)
y1 = x1**3 - 3*x1
plt.plot(x1,y1)
plt.fill_between(x1, y1, 0, alpha=0.20)
plt.plot([-1,1],[2,-2],'ko')
plt.xlabel('x'); plt.ylabel('y')
plt.grid(); plt.show()

x2 = np.arange(-3,3,0.01)
def f(x): return x**3 - 3*x
y2 = f(x2)
plt.plot(x2,y2)
plt.fill_between(x2, y2, 0, alpha=0.20)
plt.plot([-3,3],[f(-3),f(3)],'ko')
plt.xlabel('x'); plt.ylabel('y')
plt.grid(); plt.show()
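The two cases in the example can be reproduced by sampling the function on each closed interval, endpoints included. This is only a rough numerical sketch: the helper name `extremes_on_closed_interval` and the number of samples are assumptions, and sampling approximates the extrema rather than proving that they exist.

```python
import numpy as np

def extremes_on_closed_interval(f, a, b, samples=6001):
    """Approximate (x_max, f_max, x_min, f_min) of f on [a, b] by sampling."""
    x = np.linspace(a, b, samples)   # linspace includes both endpoints
    y = f(x)
    return x[np.argmax(y)], y.max(), x[np.argmin(y)], y.min()

def f(x):
    return x**3 - 3*x

# On [-1.5, 1.5] the extrema are at the interior points x = -1 and x = 1
print(extremes_on_closed_interval(f, -1.5, 1.5))
# On [-3, 3] the extrema move to the endpoints x = 3 and x = -3
print(extremes_on_closed_interval(f, -3.0, 3.0))
```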


The condition that the function is defined on a closed interval is a crucial requirement for the extreme value theorem to hold true. Here is a counterexample if this condition is relaxed. Let $f(x)=\frac{1}{x}$ be defined on the open interval $(0,1)$. The function $f$ is unbounded; it keeps increasing as $x$ approaches $0$. The figure below provides the plot of the function defined on the open interval $(0,1)$. The function increases precipitously as $x$ approaches the value of $0$.

#### Rolle’s Theorem

Statement: Let $f:[a,b]\rightarrow\mathbb{R}$ be differentiable. Assume that $f(a)=f(b)$. Then there is at least one point $c\in(a,b)$ where $f'(c)=0$.

View Proof of Rolle's Theorem

The proof of Rolle’s theorem is straightforward from Proposition 1 and the Extreme Value Theorem above. The Extreme Value Theorem ensures that there is an extremum on the interval, while Proposition 1 guarantees that the value of the first derivative is zero at the extremum point.

Let $f(a)=f(b)=k$.

First, assume that the function is constant, i.e., $f(x)=k\ \forall x\in[a,b]$. In this case, the first derivative at every point is equal to zero. Otherwise, assume that $\exists x\in(a,b)$ such that $f(x)>k$. The extreme value theorem asserts that $\exists c\in[a,b]$ where the function attains its maximum value, which, together with $f(x)>k=f(a)=f(b)$, implies that $c$ is an interior point of the interval, i.e., $c\in(a,b)$. Proposition 1 then guarantees that $f'(c)=0$. The same argument follows, with the minimum in place of the maximum, if $\exists x\in(a,b)$ such that $f(x)<k$.

The Extreme Value Theorem ensures that there is a local maximum or local minimum within the interval, while Proposition 1 ensures that at this local extremum, the slope of the function is equal to zero. As an example, consider the function $f:[0.5,1.5]\rightarrow\mathbb{R}$ defined as $f(x)=20\left(x-\frac{1}{2}\right)^3-20\left(x-\frac{1}{2}\right)+5$. Here $f(0.5)=f(1.5)=5$. This ensures that there is a point $c\in(0.5,1.5)$ with $f'(c)=0$. Indeed, $f'\left(\frac{1}{2}+\frac{1}{\sqrt{3}}\right)=0$ and the point $c=\frac{1}{2}+\frac{1}{\sqrt{3}}$ is the location of the local minimum. The following figure shows the graph of the function on the specified interval along with the point $c$.

View Mathematica Code that Generated the Above Figure
Clear[x]
y = 20 (x - 1/2)^3 - 20 (x - 1/2) + 5;
Expand[y]
y /. x -> 1.5
y /. x -> 0.5
y /. x -> (1/2 + 1/Sqrt[3])
D[y, x] /. x -> (1/2 + 1/Sqrt[3])
Plot[y, {x, 0.5, 1.5}, Epilog -> {PointSize[0.04], Point[{1/2 + 1/Sqrt[3], y /. x -> 1/2 + 1/Sqrt[3]}], Red, Line[{{-3, y /. x -> 1/2 + 1/Sqrt[3]}, {1.5, y /. x -> 1/2 + 1/Sqrt[3]}}]}, Filling -> Axis, PlotRange -> All, Frame -> True, AxesLabel -> {"x", "y"}]

View Python Code
import math
import numpy as np
import sympy as sp
import matplotlib.pyplot as plt

def f(x): return 20*(x - 1/2)**3 - 20*(x - 1/2) + 5
print("y(1.5):",f(1.5))
print("y(0.5):",f(0.5))
print("y(1/2 + 1/math.sqrt(3)):",f(1/2 + 1/math.sqrt(3)))
x1 = sp.symbols('x')
print("dy/dx(1/2 + 1/math.sqrt(3)):",sp.diff(20*(x1 - 1/2)**3 - 20*(x1 - 1/2) + 5,x1).subs(x1,1/2 + 1/math.sqrt(3)))

x = np.arange(0.5,1.5,0.01)
y = 20*(x - 1/2)**3 - 20*(x - 1/2) + 5
plt.plot(x,y)
plt.fill_between(x, y, 0, alpha=0.20)
plt.plot([1/2 + 1/math.sqrt(3)],[f(1/2 + 1/math.sqrt(3))],'ko')
plt.plot([0.5,1.5],[f(1/2 + 1/math.sqrt(3)),f(1/2 + 1/math.sqrt(3))],'r')
plt.xlabel('x'); plt.ylabel('y')
plt.grid(); plt.show()
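Rolle’s theorem only guarantees that a point with zero slope exists; it does not say how to find it. One possible way to locate it for the example above is bisection on the derivative, sketched below. The helper name `bisect_root` and the hand-written derivative `df` are assumptions for illustration.

```python
import math

def f(x):
    return 20*(x - 0.5)**3 - 20*(x - 0.5) + 5

def df(x):
    # derivative of f, written out by hand
    return 60*(x - 0.5)**2 - 20

def bisect_root(g, lo, hi, tol=1e-12):
    """Find a root of g in [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    while hi - lo > tol:
        mid = 0.5*(lo + hi)
        if g(lo)*g(mid) <= 0:
            hi = mid      # the sign change is in the left half
        else:
            lo = mid      # the sign change is in the right half
    return 0.5*(lo + hi)

print("f(0.5), f(1.5):", f(0.5), f(1.5))   # equal end values, as Rolle's theorem requires
c = bisect_root(df, 0.5, 1.5)
print("c:", c, "expected:", 0.5 + 1/math.sqrt(3))
```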

##### Generalized Rolle’s Theorem

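Statement: Let $f:[a,b]\rightarrow\mathbb{R}$ be $n$ times differentiable. Assume that $f$ is equal to zero at $n+1$ distinct points $x_0<x_1<\cdots<x_n$ in $[a,b]$. Then there is at least one point $c\in(x_0,x_n)$ where $f^{(n)}(c)=0$.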

View Proof of the Generalized Rolle's Theorem

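The proof uses mathematical induction. We will first show that the argument holds for $n=1$ (2 distinct points). If $f(x_0)=f(x_1)=0$, then, using Rolle’s theorem, $\exists c\in(x_0,x_1)$ such that $f'(c)=0$.

Similarly, assuming that $n=2$ (three distinct points), i.e., $f(x_0)=f(x_1)=f(x_2)=0$, then using the result from $n=1$, $\exists c_1\in(x_0,x_1)$ and $\exists c_2\in(x_1,x_2)$ such that $f'(c_1)=f'(c_2)=0$. Applying the first case to $f'$, $\exists c\in(c_1,c_2)$ such that $f''(c)=0$.

The argument follows for higher values of $n$, leading to the statement of the theorem: if $f$ has $(n+1)$ distinct zeros $x_0<x_1<\cdots<x_n$, then:

$$f(x_0)=f(x_1)=\cdots=f(x_n)=0\Rightarrow\exists c\in(x_0,x_n):\ f^{(n)}(c)=0$$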
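As a small illustration of the case with three distinct zeros, the sketch below uses SymPy on the hypothetical function $f(x)=x(x-1)(x-2)$, which vanishes at $0$, $1$ and $2$. The function is an assumption chosen for illustration and does not come from the text.

```python
import sympy as sp

x = sp.symbols('x')
f = x*(x - 1)*(x - 2)            # three distinct zeros: 0, 1, 2

# Generalized Rolle with n = 2: the second derivative must vanish inside (0, 2)
second_derivative = sp.expand(sp.diff(f, x, 2))
roots = sp.solve(second_derivative, x)
print(second_derivative, roots)
```

Here the second derivative is linear, so its single root is the point $c$ promised by the theorem.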

#### Mean Value Theorem

Statement: Let $f:[a,b]\rightarrow\mathbb{R}$ be differentiable. Then, there is at least one point $c\in(a,b)$ such that $f'(c)=\frac{f(b)-f(a)}{b-a}$.

View Proof of Mean Value Theorem

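The proof of the mean value theorem is straightforward by applying Rolle’s theorem to the function $g:[a,b]\rightarrow\mathbb{R}$ defined as:

$$g(x)=f(x)-f(a)-\frac{f(b)-f(a)}{b-a}(x-a)$$

Clearly, $g$ satisfies the conditions of Rolle’s theorem (it is differentiable and $g(a)=g(b)=0$). We also have:

$$g'(x)=f'(x)-\frac{f(b)-f(a)}{b-a}$$

Therefore, by Rolle’s theorem, there is a point $c\in(a,b)$ such that $g'(c)=0$, i.e., $f'(c)=\frac{f(b)-f(a)}{b-a}$.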

The mean value theorem states that there is a point $c$ inside the interval such that the slope of the function at $c$ is equal to the average slope along the interval. The following example will serve to illustrate the main concept of the mean value theorem. Consider the function $f:[-3,3]\rightarrow\mathbb{R}$ defined as:

$$f(x)=x^3-3x$$

The slope or first derivative of $f$ is given by:

$$f'(x)=3x^2-3$$

The average slope of $f$ on the interval $[-3,3]$ is given by:

$$\frac{f(3)-f(-3)}{3-(-3)}=\frac{18-(-18)}{6}=6$$

The two points $x=-\sqrt{3}$ and $x=\sqrt{3}$ have a slope equal to the average slope:

$$f'(-\sqrt{3})=f'(\sqrt{3})=3(3)-3=6$$

The figure below shows the function $f$ on the specified interval. The line representing the average slope is shown in black connecting the points $(-3,f(-3))$ and $(3,f(3))$. The red lines show the slopes at the points $x=-\sqrt{3}$ and $x=\sqrt{3}$.

View Mathematica Code that Generated the Above Figure
Clear[x]
y = x^3 - 3 x;
averageslope = ((y /. x -> 3) - (y /. x -> -3))/(3 + 3)
dydx = D[y, x];
a = Solve[D[y, x] == averageslope, x]
Point1 = {x /. a[[1, 1]], y /. a[[1, 1]]}
Point2 = {x /. a[[2, 1]], y /. a[[2, 1]]}
Plot[y, {x, -3, 3},  Epilog -> {PointSize[0.04], Point[{-3, y /. x -> -3}], Point[{3, y /. x -> 3}], Line[{{-3, y /. x -> -3}, {3, y /. x -> 3}}], Point[Point1],Point[Point2], Red, Line[{Point1 + {-1, -averageslope}, Point1, Point1 + {1, averageslope}}], Line[{Point2 + {-1, -averageslope}, Point2, Point2 + {1, averageslope}}]}, Filling -> Axis, PlotRange -> All, Frame -> True, AxesLabel -> {"x", "y"}]

View Python Code
import numpy as np
import sympy as sp
import matplotlib.pyplot as plt

def f(x): return x**3 - 3*x
averageSlope = (f(3) - f(-3))/(3 + 3)
print("averageSlope:",averageSlope)
x1 = sp.symbols('x')
dydx = sp.diff(x1**3 - 3*x1,x1)
print("dy/dx:",dydx)
sol = list(sp.solveset(dydx - averageSlope,x1))
print("Solve:",sol)

Point1 = [sol[0], f(sol[0])]
Point2 = [sol[1], f(sol[1])]
print("Point1:",Point1)
print("Point2:",Point2)

x = np.arange(-3,3,0.01)
y = x**3 - 3*x
plt.plot(x,y)
plt.fill_between(x, y, 0, alpha=0.20)
plt.plot([-3,3,Point1[0],Point2[0]],[f(-3),f(3),Point1[1],Point2[1]],'ko')
plt.plot([Point1[0]-1,Point1[0]+1,Point1[0]],
[Point1[1]-averageSlope,Point1[1]+averageSlope,Point1[1]],'r')
plt.plot([Point2[0]-1,Point2[0]+1,Point2[0]],
[Point2[1]-averageSlope,Point2[1]+averageSlope,Point2[1]],'r')
plt.plot([-3,3],[f(-3),f(3)],'k')
plt.xlabel('x'); plt.ylabel('y')
plt.grid(); plt.show()


#### First and Second Derivative Tests

The Mean Value Theorem leads to two important results that are fundamental for analyzing the behaviour of functions around their extreme values. First, we define the notion of increasing and decreasing functions.

##### Definition: Increasing and Decreasing Functions

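Let $f:[a,b]\rightarrow\mathbb{R}$, then:

• $f$ is increasing if $f(x_1)\leq f(x_2)\ \forall x_1<x_2$ in $[a,b]$
• $f$ is decreasing if $f(x_1)\geq f(x_2)\ \forall x_1<x_2$ in $[a,b]$

A function is strictly increasing or decreasing if $f(x_1)<f(x_2)$ or $f(x_1)>f(x_2)$ respectively.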

##### Proposition 2: First Derivative Test

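Let $f:[a,b]\rightarrow\mathbb{R}$ be a continuous and smooth (differentiable) function. Then:

• If $f'(x)\geq 0\ \forall x\in[a,b]$ then $f$ is increasing
• If $f'(x)\leq 0\ \forall x\in[a,b]$ then $f$ is decreasing
• If $f'(x)=0\ \forall x\in[a,b]$ then $f$ is constant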

View Proof of the First Derivative Test

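Assume $f'(x)\geq 0\ \forall x\in[a,b]$. Let $x_1<x_2$ in $[a,b]$. Using the mean value theorem, $\exists c\in(x_1,x_2)$ such that

$$f'(c)=\frac{f(x_2)-f(x_1)}{x_2-x_1}$$

As $f'(c)\geq 0$ and $x_2-x_1>0$, therefore $f(x_2)-f(x_1)\geq 0$, implying that $f(x_2)\geq f(x_1)$. Therefore, $f$ is increasing. Similar arguments follow for the second and third cases.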

##### Proposition 3: Second Derivative Test

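Let $f:[a,b]\rightarrow\mathbb{R}$ be a smooth function with a continuous second derivative. Let $c\in(a,b)$ be such that $f'(c)=0$. Then:

• If $f''(c)<0$ then $c$ is a local maximum
• If $f''(c)>0$ then $c$ is a local minimum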

View Proof of the Second Derivative Test

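Consider the first case, $f''(c)<0$. Since $f''$ is continuous, $\exists\varepsilon>0$ such that $f''(x)<0\ \forall x\in I$, where the interval $I=(c-\varepsilon,c+\varepsilon)$. We will show that the function $f$ is increasing on the interval $(c-\varepsilon,c]$ and decreasing on the interval $[c,c+\varepsilon)$, from which it follows that $c$ is a local maximum.

Let $x\in(c-\varepsilon,c)$. Using the mean value theorem applied to $f'$, $\exists d\in(x,c)$ such that:

$$f''(d)=\frac{f'(c)-f'(x)}{c-x}$$

But $f'(c)=0$ and $f''(d)<0$, therefore, $f'(x)>0$. Since $x$ is arbitrary, using Proposition 2 the function $f$ is increasing on the interval $(c-\varepsilon,c]$. Similarly, let $x\in(c,c+\varepsilon)$. Using the mean value theorem applied to $f'$, $\exists d\in(c,x)$ such that:

$$f''(d)=\frac{f'(x)-f'(c)}{x-c}$$

But $f'(c)=0$ and $f''(d)<0$, therefore, $f'(x)<0$. Since $x$ is arbitrary, using Proposition 2 the function $f$ is decreasing on the interval $[c,c+\varepsilon)$. I.e., the function is increasing on the left of $c$ and decreasing on the right of $c$. Therefore, $c$ is a local maximum.

Similar arguments apply for the second case.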

This proposition is very important for optimization problems when a local maximum or minimum is to be obtained for a particular function. In order to identify whether the solution corresponds to a local maximum or minimum, the second derivative of the function can be evaluated. Considering the example given above under Proposition 1, $f(x)=x^3-3x$, the second derivative is given by:

$$f''(x)=6x$$

We have already identified $x=1$ and $x=-1$ as locations of the local extremum values. To know whether they are local maxima or local minima, we can evaluate the second derivative at these points. $f''(1)=6>0$. Therefore, $x=1$ is the location of a local minimum, while $f''(-1)=-6<0$, implying that $x=-1$ is the location of a local maximum.
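The same classification can be automated. The sketch below uses SymPy to find the critical points of $x^3-3x$ and applies the second derivative test to each. The variable names and the "inconclusive" label for a zero second derivative are assumptions for illustration.

```python
import sympy as sp

x = sp.symbols('x')
f = x**3 - 3*x

critical_points = sp.solve(sp.diff(f, x), x)    # points where f'(x) = 0
second = sp.diff(f, x, 2)                        # f''(x) = 6x

classification = {}
for c in critical_points:
    curvature = second.subs(x, c)
    if curvature < 0:
        classification[int(c)] = "local maximum"
    elif curvature > 0:
        classification[int(c)] = "local minimum"
    else:
        classification[int(c)] = "inconclusive"   # the test says nothing when f''(c) = 0

print(classification)
```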

#### Taylor’s Theorem

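As an introduction to Taylor’s Theorem, let’s assume that we have a function $f:\mathbb{R}\rightarrow\mathbb{R}$ that can be represented as a polynomial function in the following form:

$$f(x)=b_0+b_1(x-a)+b_2(x-a)^2+\cdots+b_n(x-a)^n+\cdots$$

where $a$ is a fixed point and each $b_i$ is a constant. The best way to find these constants is to find $f(x)$ and its derivatives when $x=a$. So, when $x=a$ we have:

$$f(a)=b_0+0+0+\cdots=b_0$$

Therefore, $b_0=f(a)$.

The derivatives of $f$ have the form:

$$f'(x)=b_1+2b_2(x-a)+3b_3(x-a)^2+\cdots$$

$$f''(x)=2b_2+(3)(2)b_3(x-a)+(4)(3)b_4(x-a)^2+\cdots$$

$$f^{(k)}(x)=k!\,b_k+\frac{(k+1)!}{1!}b_{k+1}(x-a)+\cdots$$

The derivatives of $f$ when $x=a$ have the form:

$$f'(a)=b_1,\qquad f''(a)=2b_2,\qquad f^{(k)}(a)=k!\,b_k$$

Therefore, $\forall k\geq 1$:

$$b_k=\frac{f^{(k)}(a)}{k!}$$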
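The relation between the coefficients and the derivatives, $b_k=f^{(k)}(a)/k!$, can be checked symbolically. In the sketch below, the point $a=1$ and the coefficient list `b` are hypothetical values chosen for illustration; the code builds the polynomial from them and then recovers them from its derivatives at $a$.

```python
import sympy as sp

x = sp.symbols('x')
a = 1
b = [2, 3, -4, 5]     # hypothetical coefficients b_0 .. b_3

# f(x) = b_0 + b_1 (x - a) + b_2 (x - a)^2 + b_3 (x - a)^3
f = sum(bk*(x - a)**k for k, bk in enumerate(b))

# b_k = f^(k)(a) / k!  (the 0th derivative is f itself)
recovered = [sp.diff(f, x, k).subs(x, a)/sp.factorial(k) for k in range(len(b))]
print(recovered)
```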

The above does not serve as a rigorous proof of Taylor’s Theorem. It is rather an illustration that if an infinitely differentiable function can be represented as the sum of an infinite number of polynomial terms, then the Taylor series form of a function defined at the beginning of this section is obtained. The following is the exact statement of Taylor’s Theorem:

Statement of Taylor’s Theorem: Let $f:\mathbb{R}\rightarrow\mathbb{R}$ be $n+1$ times differentiable on an open interval $I$. Let $a,x\in I$. Then, $\exists\xi$ between $a$ and $x$ such that:

$$f(x)=f(a)+f'(a)(x-a)+\frac{f''(a)}{2!}(x-a)^2+\cdots+\frac{f^{(n)}(a)}{n!}(x-a)^n+\frac{f^{(n+1)}(\xi)}{(n+1)!}(x-a)^{n+1}$$

There are many proofs that can be found online for Taylor’s Theorem. Fundamentally, all of them rely on the Mean Value Theorem. We provide one proof in the expandable box below.

View Proof of Taylor's Theorem

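This proof is based on this link. Fix the points $a$ and $x$ in $I$ and let $t$ be a variable taking values between them. Define a function $g(t)$ which provides the difference between $f(x)$ and the polynomial expansion around the point $t$ as follows:

$$g(t)=f(x)-f(t)-f'(t)(x-t)-\frac{f''(t)}{2!}(x-t)^2-\cdots-\frac{f^{(n)}(t)}{n!}(x-t)^n$$

Notice that $g(x)=0$, and that substituting $t$ with $a$ gives:

$$g(a)=f(x)-f(a)-f'(a)(x-a)-\frac{f''(a)}{2!}(x-a)^2-\cdots-\frac{f^{(n)}(a)}{n!}(x-a)^n$$

The derivative of $g$ with respect to its variable $t$ gives (all terms cancel in pairs except the last one):

$$g'(t)=-\frac{f^{(n+1)}(t)}{n!}(x-t)^n$$

Define a new function $\phi$ of $t$ given by:

$$\phi(t)=g(t)-\left(\frac{x-t}{x-a}\right)^{n+1}g(a)$$

The derivative of $\phi$ with respect to $t$ gives:

$$\phi'(t)=g'(t)+(n+1)\frac{(x-t)^n}{(x-a)^{n+1}}g(a)$$

Evaluating the function $\phi$ at the points $a$ and $x$ gives:

$$\phi(a)=\phi(x)=0$$

Using the Mean Value Theorem, $\exists\xi$ between $a$ and $x$ such that:

$$\phi'(\xi)=\frac{\phi(x)-\phi(a)}{x-a}=0\Rightarrow-\frac{f^{(n+1)}(\xi)}{n!}(x-\xi)^n+(n+1)\frac{(x-\xi)^n}{(x-a)^{n+1}}g(a)=0$$

Rearranging:

$$g(a)=\frac{f^{(n+1)}(\xi)}{(n+1)!}(x-a)^{n+1}$$

Which is the difference between $f(x)$ and the polynomial expansion around the point $a$.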

Explanation and Importance: Taylor’s Theorem has numerous implications in engineering analysis. In the following, we discuss the meaning of the theorem and some of its implications:

• Simply put, Taylor’s Theorem states the following: if the function $f$ and its derivatives are known at a point $a$, then the function at a point $x$ away from $a$ can be approximated by the value of the Taylor’s approximation $P_n(x)$:

$$f(x)\approx P_n(x)=f(a)+f'(a)(x-a)+\frac{f''(a)}{2!}(x-a)^2+\cdots+\frac{f^{(n)}(a)}{n!}(x-a)^n$$

The error (the difference between the exact $f(x)$ and the approximation $P_n(x)$) is given by:

$$E=f(x)-P_n(x)=\frac{f^{(n+1)}(\xi)}{(n+1)!}(x-a)^{n+1}$$

The term $f^{(n+1)}(\xi)$ is bounded whenever $f^{(n+1)}$ is a continuous function on the closed interval from $a$ to $x$. Therefore, when $x>a$, the upper bound of the error can be given as:

$$|E|\leq\max_{\xi\in[a,x]}\frac{|f^{(n+1)}(\xi)|}{(n+1)!}(x-a)^{n+1}$$

While, when $x<a$, the upper bound of the error can be given as:

$$|E|\leq\max_{\xi\in[x,a]}\frac{|f^{(n+1)}(\xi)|}{(n+1)!}(a-x)^{n+1}$$

The above implies that the error is directly proportional to $|x-a|^{n+1}$. This is traditionally written as follows:

$$f(x)=P_n(x)+\mathcal{O}(h^{n+1})$$

where $h=x-a$. In other words, as $h$ gets smaller and smaller, the error gets smaller in proportion to $h^{n+1}$. As an example, if we choose $h_1$ and then $h_2=0.5h_1$, then $E_2\approx\frac{E_1}{2^{n+1}}$. I.e., if the step size is halved, the error is divided by approximately $2^{n+1}$.

• If the function $f$ is infinitely differentiable on an interval $I$, and if $\lim_{n\to\infty}\frac{f^{(n+1)}(\xi)}{(n+1)!}(x-a)^{n+1}=0$, then $f(x)$ is the limit of the sum of the Taylor series. The error, which is the difference between the infinite sum and the approximation, is called the truncation error as defined in the error section.

• There are many rigorous proofs available for Taylor’s Theorem and the majority rely on the mean value theorem above. Notice that if we choose $n=0$, then the mean value theorem is obtained. For a rigorous proof, you can check one of these links: link 1 or link 2. In particular, L’Hôpital’s rule was used in the Wikipedia proof, which in turn relies on the mean value theorem.
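The scaling of the error with the step size can be observed numerically. The sketch below assumes $f(x)=\sin(x)$, $a=1$, $n=2$ and the step sizes $0.2$ and $0.1$, all chosen for illustration; the helper name `taylor_sin` is likewise hypothetical. Halving the step should divide the error by roughly $2^{n+1}=8$.

```python
import math

def taylor_sin(x, a, n):
    """Taylor polynomial of sin of order n around a, evaluated at x."""
    # The derivatives of sin cycle through sin, cos, -sin, -cos.
    derivs = [math.sin(a), math.cos(a), -math.sin(a), -math.cos(a)]
    return sum(derivs[k % 4]*(x - a)**k/math.factorial(k) for k in range(n + 1))

a, n = 1.0, 2
h1, h2 = 0.2, 0.1                     # h2 is half of h1
e1 = abs(math.sin(a + h1) - taylor_sin(a + h1, a, n))
e2 = abs(math.sin(a + h2) - taylor_sin(a + h2, a, n))
ratio = e1/e2
print("errors:", e1, e2, "ratio:", ratio, "2^(n+1):", 2**(n + 1))
```

The ratio is close to, but not exactly, $2^{n+1}$ because $f^{(n+1)}(\xi)$ is evaluated at a different $\xi$ for each step size.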

The following code illustrates the difference between the function $f(x)=\sin(x)+0.01x^2$ and the Taylor’s polynomial $P_n(x)$ expanded around $a=1$. You can download the code and change the function, the point $a$, and the range of the plot to see how the Taylor series of other functions behave.

View Mathematica Code
Taylor[y_, x_, a_, n_] := (y /. x -> a) +
Sum[(D[y, {x, i}] /. x -> a)/i!*(x - a)^i, {i, 1, n}]
f = Sin[x] + 0.01 x^2;
Manipulate[
s = Taylor[f, x, 1, nn];
Grid[{{Plot[{f, s}, {x, -10, 10}, PlotLabel -> "f(x)=Sin[x]+0.01x^2",
PlotLegends -> {"f(x)", "P(x)"},
PlotRange -> {{-10, 10}, {-6, 30}},
ImageSize -> Medium]}, {Expand[s]}}], {nn, 1, 30, 1}]

View Python Code
import math
import numpy as np
import sympy as sp
import matplotlib.pyplot as plt
from ipywidgets.widgets import interact

x1 = sp.symbols('x')
f = sp.sin(x1) + 0.01*x1**2

def taylor(f,xi,a,n):
    # Taylor polynomial of order n around a (terms i = 0..n), evaluated at xi
    return sum([(f.diff(x1,i).subs(x1,a))/math.factorial(i)*(xi - a)**i for i in range(n+1)])

@interact(n=(1,30,1))
def update(n=1):
    x = np.arange(-10,10,0.1)
    y = np.sin(x) + 0.01*x**2
    plt.plot(x,y, label="f(x)")
    # Build the polynomial symbolically once, then evaluate it on the whole grid
    s = taylor(f,x1,1,n)
    p = sp.lambdify(x1,s,'numpy')(x)
    plt.plot(x,p, label="P(x)")
    plt.title("f(x)=sin(x)+0.01x^2")
    plt.xlabel('x'); plt.ylabel('y')
    plt.ylim(-6,30); plt.xlim(-10,10)
    plt.legend(); plt.grid(); plt.show()
    # Print the same polynomial that is plotted (expanded around a = 1)
    print(sp.expand(s))


The following tool illustrates the difference between the function $f(x)$ and the Taylor’s polynomial $P_n(x)$. You can change the order of the series expansion to see how the Taylor series of the function behaves.

The Mathematica function Series can also be used to generate the Taylor expansion of any function:

View Mathematica Code
Series[Tan[x],{x,0,7}]
Series[1/(1+x^2),{x,0,10}]

View Python Code
import sympy as sp
from IPython.display import display  # display() is provided by IPython/Jupyter
sp.init_printing(use_latex=True)
x = sp.symbols('x')
display("tan(x):",sp.series(sp.tan(x),x,0,8))
display("1/(1+x**2):",sp.series(1/(1+x**2),x,0,11))


The following tool shows how the Taylor series expansion around the point , termed in the figure, can be used to provide an approximation of different orders to a cubic polynomial, termed in the figure. Use the buttons to change the order of the series expansion. The tool provides the error at , namely . What happens when the order reaches 3?

#### Polynomial Interpolation Error

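While not related to Taylor’s Theorem, the error in the interpolating polynomial can be shown to have a form similar to the error term of Taylor’s Theorem. The following theorem will be used later in the book when evaluating the error associated with the interpolating polynomial. The proof relies on the generalized Rolle’s theorem above.

Statement of Polynomial Interpolation Error Theorem: Let $f:\mathbb{R}\rightarrow\mathbb{R}$ be $n+1$ times differentiable on an open interval $I$. Let $x_0<x_1<\cdots<x_n$ be points in $I$ and define the degree $n$ interpolating polynomial

$$p_n(x)=a_0+a_1x+\cdots+a_nx^n,\qquad p_n(x_i)=f(x_i)\ \text{for}\ 0\leq i\leq n$$

Then, $\forall x\in I$, $\exists\xi$ between $\min(x,x_0)$ and $\max(x,x_n)$ such that:

$$f(x)-p_n(x)=\frac{f^{(n+1)}(\xi)}{(n+1)!}\prod_{i=0}^{n}(x-x_i)$$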

View Proof of Polynomial Interpolation Error Theorem

This proof is based on this link. Another useful link which provides some intuitive relation between the Polynomial Interpolation Error and the Taylor Theorem can be found here.

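The error between the interpolating polynomial and the function can be defined as:

$$E(x)=f(x)-p_n(x)$$

Notice that $E(x_i)=0$, as the interpolating polynomial coefficients are obtained by solving the equations of the form $p_n(x_i)=f(x_i)$ where $0\leq i\leq n$. Fix $x\in I$ such that $x\neq x_i$ and define the function $g(t)$ as:

$$g(t)=E(t)-\frac{w(t)}{w(x)}E(x)$$

Where

$$w(t)=\prod_{i=0}^{n}(t-x_i)$$

Utilizing the fact that $p_n^{(n+1)}(t)=0$ and $w^{(n+1)}(t)=(n+1)!$, the $(n+1)$th derivative of $g$ is given by:

$$g^{(n+1)}(t)=f^{(n+1)}(t)-\frac{(n+1)!}{w(x)}E(x)$$

Notice that $g(x_i)=0$ for $0\leq i\leq n$ and $g(x)=0$. I.e., $g$ has $n+2$ distinct roots. Using the generalized Rolle’s theorem, $\exists\xi$ such that $g^{(n+1)}(\xi)=0$. Therefore:

$$E(x)=f(x)-p_n(x)=\frac{f^{(n+1)}(\xi)}{(n+1)!}\prod_{i=0}^{n}(x-x_i)$$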
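The interpolation error theorem states that $f(x)-p_n(x)=\frac{f^{(n+1)}(\xi)}{(n+1)!}\prod_{i=0}^{n}(x-x_i)$ for some $\xi$, so the error is bounded by the largest value of $|f^{(n+1)}|$ times $\frac{1}{(n+1)!}\left|\prod(x-x_i)\right|$. The sketch below checks this bound for an assumed example: $f(x)=\sin(x)$ interpolated at the nodes $0$, $0.5$ and $1$ and evaluated at $x=0.75$. The helper name `lagrange_interpolate` and all numerical choices are illustrative assumptions.

```python
import math

def lagrange_interpolate(xs, ys, x):
    """Evaluate the interpolating polynomial through (xs, ys) at x (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj)/(xi - xj)
        total += yi*weight
    return total

xs = [0.0, 0.5, 1.0]                  # n + 1 = 3 nodes, so n = 2
ys = [math.sin(v) for v in xs]
x = 0.75

error = math.sin(x) - lagrange_interpolate(xs, ys, x)
# |f'''(xi)| = |cos(xi)| <= 1, so the bound is |prod(x - x_i)| / (n + 1)!
bound = abs(math.prod(x - xi for xi in xs))/math.factorial(len(xs))
print("error:", error, "bound:", bound)
```

The Lagrange form is used here only because it is short to write; it produces the same polynomial as solving for the coefficients $a_0,\ldots,a_n$.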