Open Educational Resources

Linear Maps between vector spaces: Basic Definitions

Linear Maps

A linear map T between two vector spaces V and W is a function T:V\rightarrow W such that \forall u,v\in V,\forall\alpha,\beta\in\mathbb{R}:

    \[ T(\alpha u + \beta v)=\alpha T(u)+\beta T(v) \]

Notice that the addition of two linear maps and the multiplication of a linear map by a scalar produce linear maps as well, which implies that the set of linear maps is itself a linear vector space.
It is important not to confuse linear maps with affine maps. For example, the function f:\mathbb{R}\rightarrow\mathbb{R} defined such that \forall x\in\mathbb{R}:f(x)=5x +7 is not a linear map but rather an affine map. f is not a linear map since in general f(\alpha u+\beta v)\neq \alpha f( u)+\beta f( v). On the other hand, the function g:\mathbb{R}\rightarrow\mathbb{R} defined such that \forall x\in\mathbb{R}:g(x)=5x is indeed a linear map.
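This distinction is easy to verify numerically. The following short Python sketch (the values of \alpha, \beta, u, and v are arbitrary choices) checks the linearity condition for both functions:

```python
# f(x) = 5x + 7 is affine, not linear; g(x) = 5x is linear.
def f(x):
    return 5 * x + 7

def g(x):
    return 5 * x

alpha, beta, u, v = 2.0, 3.0, 1.0, 4.0

# Linearity requires T(alpha*u + beta*v) == alpha*T(u) + beta*T(v).
print(f(alpha * u + beta * v) == alpha * f(u) + beta * f(v))  # False
print(g(alpha * u + beta * v) == alpha * g(u) + beta * g(v))  # True
```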


Linear maps between finite dimensional linear vector spaces can also be referred to as Tensors. Tensor analysis provides a natural and concise mathematical tool for the analysis of various engineering problems, in particular, solid mechanics. For a detailed description of tensors, refer to the Wolfram article on tensors.

According to Wikipedia, the origin of the word “tensor” dates back to the nineteenth century when it was introduced by Woldemar Voigt. It is likely that the word originated because one of the early linear operators introduced was the symmetric Cauchy stress matrix, which functions to convert area vectors into force vectors. At the time, perhaps scientists were interested in things that “stretch” and thus the word “tensor”, from the Latin root “tendere”, came about.

Kernel of Linear Maps

Let T be a linear map between two vector spaces V and W. Then, the kernel of T or \ker(T) is the set of all vectors that are mapped into the zero vector, i.e.:

    \[ \ker(T)=\{x\in V|T(x)=0\} \]

For example, consider the linear map f:\mathbb{R}^2\rightarrow\mathbb{R} defined such that \forall x\in\mathbb{R}^2, f(x)=5x_1+6x_2. Then, the kernel of this linear map consists of all the vectors in \mathbb{R}^2 that are mapped to zero, i.e., the vectors whose components x_1 and x_2 satisfy:

    \[ 5x_1 + 6x_2=0\Rightarrow x_1=-1.2x_2 \]

There are infinitely many vectors that satisfy this condition. The set of all those vectors is given as: \ker(f)=\left\{(-1.2t,t)|t\in\mathbb{R}\right\}
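As a quick numerical check (a Python sketch; the sampled values of t are arbitrary), every vector of the form (-1.2t,t) is indeed mapped to zero:

```python
# The map f(x) = 5*x1 + 6*x2 from the example; any vector of the
# form (-1.2*t, t) should land in the kernel.
def f(x1, x2):
    return 5 * x1 + 6 * x2

for t in [-3.0, 0.0, 1.0, 2.5]:
    assert abs(f(-1.2 * t, t)) < 1e-12  # maps to zero
print("all sampled kernel vectors map to 0")
```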

Matrix Representation of Linear Maps

The matrix representation of linear maps is the most convenient way to represent linear maps when orthonormal basis sets are chosen for the underlying vector spaces. Consider the linear map T:\mathbb{R}^n\rightarrow\mathbb{R}^m. Let B=\{e_1,e_2,\cdots,e_n\} and B'=\{e'_1,e'_2,\cdots,e'_m\} be the orthonormal basis sets for \mathbb{R}^n and \mathbb{R}^m respectively. Then, because of the linearity of the map, T is completely defined by the images Te_j of the basis vectors. Since Te_j\in\mathbb{R}^m, it has m components which can be denoted as follows:

    \[ Te_j=T_{1j} e'_1+T_{2j} e'_2+\cdots+T_{mj} e'_m=\sum_{i=1}^m T_{ij}e'_i \]

Therefore, \forall x\in\mathbb{R}^n:x=\sum_{j=1}^n(x_je_j) and its image under T has the form:

    \[ T\sum_{j=1}^n(x_je_j)=\sum_{j=1}^n(x_jTe_j)=\sum_{i=1}^m \sum_{j=1}^n T_{ij}x_j e'_i \]

Which in traditional matrix form admits the representation:

    \[ Tx= \left( \begin{array}{cccc} T_{11}&T_{12}&\cdots&T_{1n}\\ T_{21}&T_{22}&\cdots&T_{2n}\\ \vdots & \vdots &\ddots &\vdots\\ T_{m1}&T_{m2}&\cdots &T_{mn} \end{array} \right) \left( \begin{array}{cc} x_1\\ x_2\\ \vdots\\ \vdots\\ x_n \end{array} \right) \]

Notice that the vectors Te_j are the column vectors of the matrix representation T.
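This observation can be checked numerically. In the following Python sketch (using NumPy, with an arbitrarily chosen 2\times 3 matrix), the image of each standard basis vector is compared with the corresponding column:

```python
import numpy as np

# A hypothetical linear map T: R^3 -> R^2, chosen arbitrarily.
T = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# The image of each standard basis vector e_j is the j-th column of T.
for j in range(3):
    e_j = np.zeros(3)
    e_j[j] = 1.0
    assert np.allclose(T @ e_j, T[:, j])
print("columns of T are the images T e_j")
```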

Matrix Representation and Change of Basis

The components of the matrix representation of T defined above depend on the choice of the orthonormal basis sets for each vector space. For the discussion in this section, we will restrict ourselves to square matrices, i.e., linear maps between vector spaces of the same dimension.
Let T:\mathbb{R}^n\rightarrow\mathbb{R}^n. Let B=\{e_1,e_2,\cdots,e_n\} be the chosen orthonormal basis set for both vector spaces and let B'=\{e'_1,e'_2,\cdots,e'_n\} be another orthonormal basis set and let Q_{ij}=e'_i\cdot e_j be the matrix of coordinate transformation as defined in the Change of Basis section. The matrix representation of T when B' is chosen as the basis set is denoted by T'. The relationship between T' and T can be obtained as follows:
Let x\in\mathbb{R}^n, denote y=Tx. Let x' and y' denote the representation of x and y when B' is chosen as the coordinate system. Therefore in each coordinate system we have:

    \[ y'=T'x'\hspace{10mm} y=Tx \]

In addition, the relationship between the coordinates in the two coordinate systems is given by:

    \[ x'=Qx\hspace{10mm}y'=Qy \]


    \[ y'=T'x'=T'Qx\hspace{10mm}\text{ and }\hspace{10mm}y'=Qy=QTx\Rightarrow T'Qx=QTx \]

This is true for every x, therefore:

    \[ T'Q=QT\Rightarrow T'=QTQ^T \]
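The relationship T'=QTQ^T can be verified numerically. The following Python sketch (using NumPy; the matrix T and the angle \theta are arbitrary choices) builds Q from a counterclockwise rotation of the basis vectors and checks that y'=T'x' is consistent with y=Tx:

```python
import numpy as np

# Rotation of the basis by theta; Q_ij = e'_i . e_j as in the
# Change of Basis section.
theta = 0.3
Q = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

# An arbitrary linear map T expressed in the basis B.
T = np.array([[2.0, 1.0],
              [0.5, 3.0]])

T_prime = Q @ T @ Q.T  # T' = Q T Q^T

# Check y' = T' x' against y = T x, with x' = Q x and y' = Q y.
x = np.array([1.0, -2.0])
y = T @ x
assert np.allclose(T_prime @ (Q @ x), Q @ y)
print("T' = Q T Q^T is consistent with the coordinate transformation")
```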

In the following tool, you can choose the components of the matrix M\in\mathbb{M}^2 and the vector u\in\mathbb{R}^2 along with an angle \theta of the counterclockwise rotation of the coordinate system. The tool then applies the transformation of coordinates from the coordinate system B=\{e_1,e_2\}, to B'=\{e'_1,e'_2\} where e'_1, e'_2 are vectors rotated by \theta counterclockwise from e_1, and e_2. On the left hand side, the tool draws the vector u in blue, the vector v=Mu in red, the original coordinate system in black, and the vectors of the new coordinate system in dashed black. At the bottom of the left hand side drawing you will find the expressions for u, M, v, e'_1, and e'_2 using the basis set B. On the right hand side, the tool draws the vectors u' in blue, v'=M'u' in red, and the new coordinate system in black. At the bottom of the right hand side, you will find the expressions for u', M', and v' using the basis set B'.


Similarly, the following tool is for three dimensional Euclidean vector spaces. The new coordinate system B'=\{e'_1,e'_2, e'_3\} is obtained by simultaneously applying a counterclockwise rotation \theta_x, \theta_y, and \theta_z around the first, second, and third coordinate system axis, respectively.


Tensor Product

Let u\in\mathbb{R}^n and v\in\mathbb{R}^m. The tensor product denoted by v\otimes u is a linear map v\otimes u:\mathbb{R}^n\rightarrow\mathbb{R}^m defined such that \forall x\in\mathbb{R}^n:

    \[ (v\otimes u) x=(x\cdot u)v \]

In simple words, the tensor product defined above utilizes the linear dot product operation and a fixed vector u\in \mathbb{R}^n to produce a real number using the expression (x\cdot u), which is conveniently a linear function of x\in\mathbb{R}^n. The resulting number is then multiplied by the vector v\in\mathbb{R}^m.
Obviously, the tensor product of vectors belonging to vector spaces of dimension higher than 1 is not invertible; in fact, the range of v\otimes u is one dimensional (why?)!
The following are some of the properties of the tensor product that can be deduced directly from the definition and the properties of the dot product operation, \forall u,v\in\mathbb{R}^n,\forall w\in\mathbb{R}^m,\forall a\in\mathbb{R}^l,\forall T:\mathbb{R}^n\rightarrow\mathbb{R}^l,\forall\alpha,\beta\in\mathbb{R}:

    \[ (\alpha u + \beta v)\otimes w=\alpha (u\otimes w) + \beta (v\otimes w) \]

    \[ w\otimes(\alpha u + \beta v)=\alpha (w\otimes u) + \beta (w\otimes v) \]

    \[ (a\otimes u)(v\otimes w)=(u\cdot v)(a\otimes w) \]

    \[ T\left(u\otimes w\right)=Tu\otimes w \]

Another property is that if p,q, and r\in\mathbb{R}^3 are three orthonormal vectors, then:

    \[ I=p\otimes p + q\otimes q + r\otimes r \]

It is important to note that the tensor product defined here is sometimes referred to as the dyadic product or the outer product of two vectors which is a particular type of the more general tensor product.
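The identity decomposition above can be checked numerically for any orthonormal triple. In this Python sketch (using NumPy), p, q, and r are taken as the columns of an orthogonal matrix obtained from a QR factorization of a random matrix:

```python
import numpy as np

# Any orthonormal triple works; here p, q, r are the columns of an
# orthogonal matrix produced by a QR factorization.
rng = np.random.default_rng(0)
Qmat, _ = np.linalg.qr(rng.standard_normal((3, 3)))
p, q, r = Qmat[:, 0], Qmat[:, 1], Qmat[:, 2]

# I = p (x) p + q (x) q + r (x) r
I = np.outer(p, p) + np.outer(q, q) + np.outer(r, r)
assert np.allclose(I, np.eye(3))
print("sum of the three outer products is the identity")
```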

Matrix Representation of the Tensor Product

Let u,v\in\mathbb{R}^3 and consider the tensor product v\otimes u. Consider the orthonormal basis set B=\{e_1,e_2,e_3\}. Then, the tensor product can be expressed in component form as follows:

    \[ v\otimes u=(v_1e_1+v_2e_2+v_3e_3)\otimes (u_1e_1+u_2e_2+u_3e_3)=\sum_{i,j=1}^3 v_iu_j (e_i\otimes e_j) \]

Now, \forall x\in\mathbb{R}^3 we have:

    \[ (v\otimes u)x=\sum_{i,j,k=1}^3 v_iu_j (e_i\otimes e_j)(x_ke_k)=\sum_{i,j=1}^3 v_iu_j x_j e_i \]

Which can be represented in matrix form as follows:

    \[ (v\otimes u)x= \left( \begin{array}{cc} \sum_{j=1}^3 v_1u_j x_j\\ \sum_{j=1}^3 v_2u_j x_j\\ \sum_{j=1}^3 v_3u_j x_j \end{array} \right)= \left( \begin{array}{ccc} v_1u_1&v_1u_2&v_1u_3\\ v_2u_1&v_2u_2&v_2u_3\\ v_3u_1&v_3u_2&v_3u_3 \end{array} \right) \left( \begin{array}{cc} x_1\\ x_2\\ x_3 \end{array} \right) \]
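The component form can be verified with NumPy, whose np.outer function produces exactly the matrix with components v_iu_j above (the vectors chosen here are arbitrary):

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
u = np.array([4.0, 5.0, 6.0])
x = np.array([-1.0, 0.5, 2.0])

# The matrix of v (x) u has components v_i * u_j ...
M = np.outer(v, u)
# ... and acting on x it gives (x . u) v.
assert np.allclose(M @ x, np.dot(x, u) * v)
print("matrix of the tensor product acts as (x . u) v")
```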

Tensor Product Representation of Linear Maps

A linear map can be decomposed into the sum of multiple tensor products. For example, one can think of a linear map T between three dimensional vector spaces as the sum of three tensor products:

    \[ T=u\otimes x +v\otimes y + w\otimes z \]

For this map to be invertible, each of the sets \{u,v,w\} and \{x,y,z\} has to be linearly independent (why?).

There is a direct relationship between the tensor product representation and the matrix representation as follows: let T:\mathbb{R}^3\rightarrow\mathbb{R}^3 and let B=\{e_1,e_2,e_3\} be an orthonormal basis set for both vector spaces, then, \forall x\in\mathbb{R}^3:

    \[ Tx=T\sum_{j=1}^3(x_je_j)=\sum_{j=1}^3(x_jTe_j)=\sum_{i=1}^3 \sum_{j=1}^3 T_{ij}x_j e_i=\sum_{i=1}^3 \sum_{j=1}^3 T_{ij}(x\cdot e_j) e_i=\sum_{i=1}^3 \sum_{j=1}^3 T_{ij}(e_i\otimes e_j) x \]

Therefore, any linear map T:\mathbb{R}^3\rightarrow\mathbb{R}^3 can be represented as the sum of nine tensor product components:

    \[ T=\sum_{i=1}^3 \sum_{j=1}^3 T_{ij}(e_i\otimes e_j) \]
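The following Python sketch (using NumPy; the matrix T is an arbitrary choice) confirms that summing the nine components T_{ij}(e_i\otimes e_j) reconstructs T:

```python
import numpy as np

# An arbitrary T; rebuilding it from the nine components
# T_ij (e_i (x) e_j) recovers the same matrix.
T = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 4.0],
              [5.0, 0.0, 6.0]])
e = np.eye(3)  # rows are the standard basis vectors e_1, e_2, e_3

T_rebuilt = sum(T[i, j] * np.outer(e[i], e[j])
                for i in range(3) for j in range(3))
assert np.allclose(T_rebuilt, T)
print("sum of the nine tensor products reconstructs T")
```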


The Set of Linear Maps

In these pages, the notation \mathbb{B}(\mathbb{R}^n,\mathbb{R}^m) is used to denote the set of linear maps between \mathbb{R}^n and \mathbb{R}^m, i.e.:

    \[ \mathbb{B}(\mathbb{R}^n,\mathbb{R}^m)=\{T:\mathbb{R}^n\rightarrow \mathbb{R}^m|T \text{ is linear}\} \]

In addition, the short notation \mathbb{B}(\mathbb{R}^n) is used to denote the set of linear maps between \mathbb{R}^n and \mathbb{R}^n, i.e.:

    \[ \mathbb{B}(\mathbb{R}^n)=\{T:\mathbb{R}^n\rightarrow \mathbb{R}^n|T \text{ is linear}\} \]

We also freely use the set of n\times n matrices denoted \mathbb{M}^n to denote \mathbb{B}(\mathbb{R}^n).


The Algebraic Structure of the Set of Linear Maps

In addition to being a vector space, the set of linear maps has an algebraic structure arising naturally from the composition operation. Let U\in\mathbb{B}(\mathbb{R}^n,\mathbb{R}^m) and V\in\mathbb{B}(\mathbb{R}^m,\mathbb{R}^l). Then, the composition map T=V\circ U:\mathbb{R}^n\rightarrow\mathbb{R}^l is also a linear map since \forall x,y\in\mathbb{R}^n,\forall \alpha,\beta\in\mathbb{R}:

    \[ \begin{split} T(\alpha x + \beta y)&=V\circ U(\alpha x + \beta y)\\ &=V(U((\alpha x + \beta y)))\\ &=V(\alpha U(x)+\beta U(y))\\ &=\alpha V(U(x)) + \beta V(U(y))\\ &=\alpha T(x) + \beta T(y) \end{split} \]

Let x\in\mathbb{R}^n. If M and N are the matrices associated with the linear maps U and V respectively, then, the components of the matrix L associated with the linear map T can be obtained from the equality: Lx=NMx:

\[ \begin{split} Lx&=\left( \begin{array}{c} \sum_{j=1}^nL_{1j}x_j\\ \sum_{j=1}^nL_{2j}x_j\\ \vdots\\ \sum_{j=1}^nL_{lj}x_j \end{array} \right)\\ NMx&=N\left( \begin{array}{c} \sum_{j=1}^nM_{1j}x_j\\ \sum_{j=1}^nM_{2j}x_j\\ \vdots\\ \sum_{j=1}^nM_{mj}x_j \end{array}\right) = \left( \begin{array}{c} \sum_{k=1}^m\sum_{j=1}^nN_{1k}M_{kj}x_j\\ \sum_{k=1}^m\sum_{j=1}^nN_{2k}M_{kj}x_j\\ \vdots\\ \sum_{k=1}^m\sum_{j=1}^nN_{lk}M_{kj}x_j \end{array} \right) \end{split} \]

Therefore, the components L_{ij} can be calculated from the components of N and M as follows:

    \[ L_{ij}=\sum_{k=1}^mN_{ik}M_{kj} \]

In other words, the component in the i^{th} row and the j^{th} column of L is obtained by multiplying the components in the i^{th} row of N by the corresponding components in the j^{th} column of M and summing. Notice that the operation NM is well defined while the operation MN isn’t, because of the difference in the dimensions n\neq m \neq l of the above spaces.

However, if U,V\in\mathbb{B}(\mathbb{R}^n), and their respective associated matrices are M,N\in\mathbb{M}^n then both composition maps are well defined. The first one is the composition map U\circ V with its associated matrix MN while the second is the composition map V\circ U and its associated matrix NM. In general, these two maps are not identical.
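Both facts are easy to check numerically. In the following Python sketch (with arbitrarily chosen matrices), composition agrees with matrix multiplication while MN and NM differ:

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
N = np.array([[0.0, 1.0],
              [1.0, 0.0]])
x = np.array([1.0, -1.0])

# Composition of the maps corresponds to matrix multiplication ...
assert np.allclose(N @ (M @ x), (N @ M) @ x)
# ... but matrix multiplication in general does not commute.
assert not np.allclose(M @ N, N @ M)
print("V(U(x)) = (NM)x, and MN != NM in general")
```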

The identity map defined by Ix=x, with its associated identity matrix I, is the identity element in the algebraic structure of \mathbb{B}(\mathbb{R}^n).

Bijective (Invertible) Linear Maps:

In this section, we are concerned with the linear maps represented by square matrices M\in\mathbb{M}^n and whether these linear maps (linear functions) are invertible or not. Recall from the Mathematical Preliminaries section that a function T:X\rightarrow Y is invertible if \exists G:Y\rightarrow X such that G(T(x))=x. G is denoted by T^{-1}. Let’s now consider the linear map (represented by a matrix T) T:\mathbb{R}^n\rightarrow\mathbb{R}^n, what are the conditions that guarantee the existence of T^{-1} such that T^{-1}T=I where I is the identity matrix? We will answer this question using a few statements:

Statement 1: Let T:\mathbb{R}^n\rightarrow\mathbb{R}^n be a linear map. Then T \mbox{ is injective}\Leftrightarrow \ker{T}=\{0\}.
This statement is simple to prove. First, note that since T is a linear map, T(0)=0.
Assume T is injective. Since T(0)=0 and T is injective, 0 is the only vector mapped to the zero vector. Therefore, \ker{T}=\{0\}. For the opposite direction, assume that \ker{T}=\{0\}. We will argue by contradiction, i.e., assume that T is not injective. Then, \exists x,y\in\mathbb{R}^n with x\neq y but Tx=Ty. Since T is linear, Tx=Ty\Rightarrow T(x-y)=0. Therefore, x-y\in\ker{T} with x-y\neq 0, which is a contradiction. Therefore, T is injective.


Statement 2: Let T:\mathbb{R}^n\rightarrow\mathbb{R}^n be a linear map. Then T \mbox{ is invertible}\Leftrightarrow \ker{T}=\{0\}.
First, assume that T is invertible; then T is injective, and Statement 1 asserts that \ker{T}=\{0\}.
Assume now that \ker{T}=\{0\}. From Statement 1, T is injective, so it remains to show that T is surjective. This can be proven by picking a basis set B=\{e_1,e_2,\cdots,e_n\} for \mathbb{R}^n and showing that the set \{Te_1,Te_2,\cdots,Te_n\} is linearly independent, which implies that it spans \mathbb{R}^n and hence that T is surjective. Since T is injective (so \ker{T}=\{0\}) and B is linearly independent we have:

    \[ T\left(\sum_{i=1}^n\alpha_ie_i\right)=\sum_{i=1}^n\alpha_iTe_i=0 \Leftrightarrow \sum_{i=1}^n\alpha_ie_i=0 \Leftrightarrow \forall i:\alpha_i=0 \]


Therefore:

    \[ \sum_{i=1}^n\alpha_iTe_i=0 \Leftrightarrow \forall i:\alpha_i=0 \]

Therefore, \{Te_1,Te_2,\cdots,Te_n\} is a linearly independent set of n vectors in \mathbb{R}^n lying in the range of T, so it forms a basis. Therefore, \forall y\in \mathbb{R}^n:\exists y_i such that

    \[ y=y_1Te_1+y_2Te_2+\cdots+y_nTe_n=T(y_1e_1+y_2e_2+\cdots+y_ne_n) \]

Therefore, x=y_1e_1+y_2e_2+\cdots+y_ne_n is the preimage of y. Therefore, T is surjective.


Statement 3: Let T:\mathbb{R}^n\rightarrow\mathbb{R}^n be a linear map. Then T \mbox{ is invertible}\Leftrightarrow the n row vectors forming the square matrix of T are linearly independent.
First assume that \{v_i\}_{i=1}^n are n linearly independent vectors that form the row vectors of the linear map T:\mathbb{R}^n\rightarrow \mathbb{R}^n. We will argue by contradiction. Assume that \exists v\neq 0 with v\in\ker{T}. Then, \forall i:v_i\cdot v=0. However, since \{v_i\}_{i=1}^n are linearly independent, they form a basis set and v can be expressed in terms of them: v=\sum_{i=1}^n\alpha_i v_i. But v is orthogonal to every v_i, so v\cdot v=\sum_{i=1}^n\alpha_i (v_i\cdot v)=0, hence v=0, which is a contradiction. Therefore, \ker{T}=\{0\} and the map is bijective by Statement 2.
For the opposite direction, assume that the map is bijective yet \{v_i\}_{i=1}^n are linearly dependent. Then there is at least one vector that can be represented as a linear combination of the others. Without loss of generality, assume that v_n=\sum_{i=1}^{n-1}\alpha_i v_i. Therefore

    \[\begin{split} \forall x\in\mathbb{R}^n:Tx&=\sum_{i=1}^n (v_i\cdot x)e_i=\sum_{i=1}^{n-1}\left((v_i\cdot x)e_i\right)+\left(\left(\sum_{i=1}^{n-1}\alpha_iv_i\right)\cdot x\right)e_n\\ & =\sum_{i=1}^{n-1}(v_i\cdot x)(e_i+\alpha_ie_n) \end{split} \]

This shows that the range of T has at most n-1 dimensions, therefore T is not surjective, which is a contradiction.



Statement 3 asserts that a square matrix is invertible if and only if the rows are linearly independent. In the following section, we will present the determinant of a matrix as a measure of whether the rows are linearly independent or not.


The determinant of a matrix representation of a linear map is a real valued function of the components of a square matrix. The determinant is used to indicate whether the rows of the matrix M\in\mathbb{M}^n are linearly dependent or not. If they are, then the determinant is equal to zero, otherwise, the determinant is not equal to zero. In the following, we will show the definition of the determinant function for n=2, n=3 and for a general n. We will also verify that the determinant of M is equal to zero if and only if the row vectors of the matrix are linearly dependent for the cases n=2 and n=3.

Determinant of M\in\mathbb{M}^2:

Let M\in\mathbb{M}^2 such that

    \[ M=\left(\begin{matrix}a_1 & a_2\\b_1& b_2\end{matrix}\right) \]

The determinant of M is defined as:

    \[ \det{M}=a_1b_2-a_2b_1 \]

Clearly, the vectors a=\{a_1, a_2\} and b=\{b_1,b_2\} are linearly dependent if and only if \det{M}=0. The determinant of the matrix M has a geometric meaning (See Figure 1). Consider the two unit vectors e_1=\{1,0\} and e_2=\{0,1\}. Let x_1=Me_1 and x_2=Me_2. The signed area of the parallelogram formed by x_1 and x_2 is equal to the determinant of the matrix M.
The following is true \forall N,M\in\mathbb{M}^2 and \forall \alpha\in\mathbb{R}:

    \[\begin{split} \det{(NM)} & =\det{N}\det{M}\\ \det{\alpha M}& =\alpha^2\det{M}\\ \det{I} & = 1 \end{split} \]

where I\in\mathbb{M}^2 is the identity matrix.
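These properties can be checked numerically. The following Python sketch (using NumPy; M, N, and \alpha are arbitrary choices) verifies all three for 2\times 2 matrices:

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [3.0, 5.0]])
N = np.array([[2.0, 0.0],
              [1.0, 4.0]])
alpha = 3.0

# det(NM) = det(N) det(M)
assert np.isclose(np.linalg.det(N @ M), np.linalg.det(N) * np.linalg.det(M))
# det(alpha M) = alpha^2 det(M) for 2x2 matrices
assert np.isclose(np.linalg.det(alpha * M), alpha**2 * np.linalg.det(M))
# det(I) = 1
assert np.isclose(np.linalg.det(np.eye(2)), 1.0)
print("all three determinant properties hold")
```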

Figure 1. Area transformation under \mathbb{M}^2

Determinant of M\in\mathbb{M}^3:

Let M\in\mathbb{M}^3 such that

    \[ M=\left(\begin{matrix}a_1 & a_2 & a_3\\b_1& b_2& b_3\\c_1&c_2&c_3\end{matrix}\right) \]

If a=\{a_1,a_2,a_3\}, b=\{b_1,b_2,b_3\}, and c=\{c_1,c_2,c_3\}, then the determinant of M is defined as:

    \[ \det{M}=a\cdot (b\times c) \]

I.e., \det{M} is the triple product of a, b, and c. From the results of the triple product, the vectors a, b, and c are linearly dependent if and only if \det{M}=0. The determinant of the matrix M has a geometric meaning (See Figure 2). Consider the three unit vectors e_1=\{1,0,0\}, e_2=\{0,1,0\}, and e_3=\{0,0,1\}. Let x_1=Me_1, x_2=Me_2, and x_3=Me_3. The determinant of M is also equal to the triple product of x_1, x_2, and x_3 and gives the volume of the parallelepiped formed by x_1, x_2, and x_3.

    \[\begin{split} \det{M}&=a\cdot(b\times c)=a_1(b_2c_3-b_3c_2)+a_2(b_3c_1-b_1c_3)+a_3(b_1c_2-b_2c_1)\\ &=a_1(b_2c_3-b_3c_2)+b_1(a_3c_2-a_2c_3)+c_1(a_2b_3-b_2a_3)\\ &=x_1\cdot (x_2\times x_3)\\ &=Me_1\cdot (Me_2\times Me_3) \end{split} \]

Additionally, \forall u, v, w\in\mathbb{R}^3 such that u, v, and w are linearly independent, it is straightforward to show the following:

    \[ \det{M}=\frac{Mu\cdot(Mv\times Mw)}{u\cdot (v\times w)} \]

In other words, the determinant gives the ratio between V_{\mbox{Transformed}} and V_{\mbox{original}} where V_{\mbox{Transformed}} is the volume of the transformed parallelepiped between Mu, Mv, and Mw and V_{\mbox{original}} is the volume of the parallelepiped between u, v, and w.
The alternator \varepsilon_{ijk} defined in Mathematical Preliminaries can be used to write the following useful equality:

    \[ \det{M}=Me_1\cdot(Me_2\times Me_3)\Rightarrow Me_i\cdot(Me_j\times Me_k)=\varepsilon_{ijk} \det{M} \]

The following is true \forall N,M\in\mathbb{M}^3 and \forall \alpha\in\mathbb{R}:

    \[\begin{split} \det{(NM)} & =\det{N}\det{M}\\ \det{\alpha M}& =\alpha^3\det{M}\\ \det{I} & = 1 \end{split} \]

where I\in\mathbb{M}^3 is the identity matrix.

Figure 2. Volume transformation under \mathbb{M}^3

Area Transformation in \mathbb{R}^3:

The following is a very important formula (often referred to as “Nanson’s Formula”) that relates the cross product of vectors in \mathbb{R}^3 to the cross product of their images under a linear transformation. This formula is used to relate area vectors before mapping to area vectors after mapping.


Let u,v\in\mathbb{R}^3. Let M\in\mathbb{M}^3 be an invertible matrix. Show the following relationship:

    \[ Mu\times Mv = (\det{M})M^{-T}\left(u\times v\right) \]


Let w be an arbitrary vector in \mathbb{R}^3. From the relationships above we have:

    \[ Mw\cdot \left(Mu\times Mv\right)=(\det{M}) w\cdot (u\times v) \]


    \[ w\cdot M^T\left(Mu\times Mv\right)=(\det{M}) w\cdot (u\times v) \]

Since w is arbitrary, the vectors M^T\left(Mu\times Mv\right) and (\det{M})(u\times v) are equal. And, since M is invertible, so is M^T. Therefore:

    \[ Mu\times Mv = (\det{M})M^{-T}\left(u\times v\right) \]


Nanson’s formula is sometimes written as follows:

    \[ a n = (\det{M})M^{-T}\left(A N\right) \]


where:

    \[\begin{split} A&=\|u\times v\|\\ N&=\frac{1}{A}(u\times v)\\ a&=\|Mu\times Mv\|\\ n&=\frac{1}{a}(Mu\times Mv) \end{split} \]
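Nanson’s formula can be verified numerically. The following Python sketch (using NumPy; u, v, and the invertible matrix M are arbitrary choices) compares the two sides of the identity:

```python
import numpy as np

u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 3.0, 1.0])
M = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])  # det(M) = 7, so M is invertible

# Mu x Mv = det(M) M^{-T} (u x v)
lhs = np.cross(M @ u, M @ v)
rhs = np.linalg.det(M) * np.linalg.inv(M).T @ np.cross(u, v)
assert np.allclose(lhs, rhs)
print("Nanson's formula verified")
```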

Determinant of M\in\mathbb{M}^n:

The determinant of M\in\mathbb{M}^n is defined using the recursive relationship:

    \[ \det{M}=\sum_{i=1}^n(-1)^{(i+1)}M_{1i}\det{N_i} \]

where N_i\in\mathbb{M}^{n-1} and is formed by eliminating the 1st row and i^{th} column of the matrix M. It can be shown that \det{M}=0\Leftrightarrow the rows of M are linearly dependent.
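The recursive definition translates directly into code. The following Python sketch (the function name det_recursive is ours; the result is compared against NumPy’s determinant for an arbitrary 3\times 3 matrix) implements the cofactor expansion along the first row:

```python
import numpy as np

def det_recursive(M):
    """Cofactor expansion along the first row of a square matrix
    given as a list of lists."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for i in range(n):
        # N_i: eliminate the 1st row and the (i+1)-th column of M.
        N_i = [row[:i] + row[i + 1:] for row in M[1:]]
        total += (-1) ** i * M[0][i] * det_recursive(N_i)
    return total

A = [[1.0, 2.0, 3.0],
     [0.0, 4.0, 5.0],
     [1.0, 0.0, 6.0]]
assert np.isclose(det_recursive(A), np.linalg.det(np.array(A)))
print("recursive determinant matches NumPy")
```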

Eigenvalues and Eigenvectors

Let M:\mathbb{R}^n\rightarrow\mathbb{R}^n. \lambda \in \mathbb{R} is called an eigenvalue of the tensor M if \exists p \in \mathbb{R}^n, p\neq 0 such that Mp=\lambda p. In this case, p is called an eigenvector of M associated with the eigenvalue \lambda.

Notice that any nonzero scalar multiple of an eigenvector is again an eigenvector: if Mp=\lambda p then \forall 0\neq\alpha \in \mathbb{R}:M(\alpha p)=\lambda (\alpha p)\Rightarrow \alpha p is an eigenvector of M. In addition, a nonzero linear combination of eigenvectors associated with the same eigenvalue is also an eigenvector:
If Mp=\lambda p and Mq=\lambda q then \forall \alpha,\beta \in \mathbb{R} with \alpha p+\beta q\neq 0: M(\alpha p+\beta q)= \alpha Mp+\beta Mq=\lambda(\alpha p + \beta q)\Rightarrow (\alpha p+\beta q) is an eigenvector of M.


Similar Matrices

Let M:\mathbb{R}^n\rightarrow\mathbb{R}^n. Let T:\mathbb{R}^n\rightarrow\mathbb{R}^n be an invertible tensor. The matrix representations of the tensors M and TMT^{-1} are termed “similar matrices”.
Similar matrices have the same eigenvalues while their eigenvectors differ by a linear transformation as follows: If \lambda\in\mathbb{R} is an eigenvalue of M with the associated eigenvector p then:

    \[ Mp=\lambda p\Rightarrow MT^{-1}Tp=\lambda T^{-1}Tp\Rightarrow TMT^{-1}(Tp)=\lambda (Tp) \]

Therefore, \lambda is an eigenvalue of TMT^{-1} and Tp is the associated eigenvector. Similarly, if \lambda\in\mathbb{R} is an eigenvalue of TMT^{-1} with the associated eigenvector q then:

    \[ TMT^{-1}q=\lambda q\Rightarrow MT^{-1}q=\lambda T^{-1}q\Rightarrow M(T^{-1}q)=\lambda (T^{-1}q) \]

Therefore, \lambda is an eigenvalue of M and T^{-1}q is the associated eigenvector. Therefore, similar matrices share the same eigenvalues.
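This can be confirmed numerically. The following Python sketch (using NumPy; M and the invertible T are arbitrary choices) compares the eigenvalues of M and TMT^{-1}:

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
T = np.array([[1.0, 2.0],
              [0.0, 1.0]])  # invertible since det(T) = 1

similar = T @ M @ np.linalg.inv(T)

# Similar matrices share the same eigenvalues (compare sorted values).
assert np.allclose(np.sort(np.linalg.eigvals(M)),
                   np.sort(np.linalg.eigvals(similar)))
print("M and T M T^{-1} have the same eigenvalues")
```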


The Eigenvalue and Eigenvector Problem

Given a tensor M:\mathbb{R}^n\rightarrow\mathbb{R}^n, we seek a nonzero vector p\in \mathbb{R}^n and a real number \lambda such that: Mp=\lambda p. This is equivalent to Mp-\lambda p=0\Rightarrow (M-\lambda I)p=0. In other words, the eigenvalue is a real number that makes the tensor M-\lambda I not invertible while the eigenvector is a non-zero vector p\in\ker(M-\lambda I). Considering the matrix representation of the tensor M, the eigenvalue is the solution to the following equation:

    \[ \det(M-\lambda I)=0 \]

The above equation is called the characteristic equation of the matrix M.
From the properties of the determinant function, the characteristic equation is an n^{th} degree polynomial of the unknown \lambda where n is the dimension of the underlying space.
In particular, \det(M-\lambda I)=a_n\lambda^n+a_{n-1}\lambda^{n-1}+\cdots +a_1\lambda+a_0, where \{a_i\}_{i=0}^n are called the polynomial coefficients. Thus, the solution to the characteristic equation abides by the following facts from polynomial functions:
– Polynomial roots: A polynomial f(\lambda) has a root a if (\lambda-a) divides f(\lambda), i.e., \exists g(\lambda) such that f(\lambda)=g(\lambda)(\lambda-a).
– The fundamental theorem of algebra states that a polynomial of degree n has n complex roots that are not necessarily distinct.
– The complex conjugate root theorem states that if a is a complex root of a polynomial with real coefficients, then the conjugate \overline{a} is also a root.

Therefore, the eigenvalues can either be real or complex numbers. If one eigenvalue is a real number, then there exists a vector with real valued components that is an eigenvector of the tensor. Otherwise, the only eigenvectors are complex eigenvectors which are elements of finite dimensional linear spaces over the field of complex numbers.

Graphical Representation of the Eigenvalues and Eigenvectors

The eigenvectors of a tensor M:\mathbb{R}^n\rightarrow\mathbb{R}^n are those vectors that do not change their direction upon transformation with the tensor M but their length is rather magnified or reduced by a factor \lambda. Notice that an eigenvalue can be negative (i.e., the transformed vector can have an opposite direction). Additionally, an eigenvalue can have the value of 0. In that case, the eigenvector is an element of the kernel of the tensor.

The following example illustrates this concept. Choose four entries for the matrix M:\mathbb{R}^2\rightarrow\mathbb{R}^2 and press evaluate.
The tool then draws 8 coloured vectors across the circle and their respective images across the ellipse. Use visual inspection to identify which vectors keep their original direction.
The tool also finds at most two eigenvectors (if they exist) and draws them in black along with their opposite directions. Use the tool to investigate the eigenvalues and eigenvectors of the following matrices:

    \[ \left(\begin{array}{cc} 1& 0\\ 0&1\end{array}\right)\hspace{10mm}\left(\begin{array}{cc} 1& 1\\ 0&1\end{array}\right)\hspace{10mm}\left(\begin{array}{cc} 0.4& 0.7\\-0.7&0.2\end{array}\right) \hspace{10mm}\left(\begin{array}{cc} 1& 2\\5&1\end{array}\right) \]

After inspection, you should have noticed that every vector is an eigenvector for the identity matrix I since \forall p\in\mathbb{R}^2:Ip=p, i.e., I possesses one eigenvalue which is 1 but all the vectors in \mathbb{R}^2 are possible eigenvectors.
You should also have noticed that some matrices don’t have any real eigenvalues, i.e., none of the vectors keep their direction after transformation. This is the case for the matrix:

    \[ M=\left(\begin{array}{cc} 0.4& 0.7\\-0.7&0.2\end{array}\right) \]

Additionally, the matrix:

    \[ N=\left(\begin{array}{cc} 1& 1\\0&1\end{array}\right) \]

has only one eigenvalue while any vector which is a multiple of

    \[ p=\left(\begin{array}{cc} 1\\0\end{array}\right) \]

keeps its direction after transformation through the matrix N. You should also notice that some matrices will have negative eigenvalues. In that case, the corresponding eigenvector will be transformed into the direction opposite to its original direction. See for example, the matrix:

    \[ O=\left(\begin{array}{cc} 1& 2\\5&1\end{array}\right) \]
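If you prefer a numerical check of these observations, the following Python sketch computes the eigenvalues of the four example matrices with NumPy:

```python
import numpy as np

I = np.eye(2)
N = np.array([[1.0, 1.0], [0.0, 1.0]])
M = np.array([[0.4, 0.7], [-0.7, 0.2]])
O = np.array([[1.0, 2.0], [5.0, 1.0]])

print(np.linalg.eigvals(I))  # both eigenvalues equal to 1
print(np.linalg.eigvals(N))  # a single repeated eigenvalue 1
print(np.linalg.eigvals(M))  # a complex pair: no real eigenvalues
print(np.linalg.eigvals(O))  # one positive and one negative eigenvalue
```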
