
Wednesday, 7 August 2024

Mathematical Biscuit I

 Can you find two irrational numbers a,b such that a^b is rational? Surprisingly, yes and the argument is very easy. 

If \sqrt{2}^{\sqrt{2}} is rational then we are done: take a=b=\sqrt{2}.

If \sqrt{2}^{\sqrt{2}} is irrational then take a=\sqrt{2}^{\sqrt{2}} and b=\sqrt{2}, so that a^b=(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}=\sqrt{2}^{\sqrt{2} \sqrt{2}}=\sqrt{2}^2=2 and we are done :)

PS: It turns out that \sqrt{2}^{\sqrt{2}} is indeed irrational (in fact transcendental). This is a consequence of an advanced result called the Gelfond-Schneider theorem. However, this was irrelevant to our argument.
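
As a quick sanity check of the second case, here is a minimal Python sketch that evaluates the tower numerically. Floating-point arithmetic cannot tell us anything about rationality; it only confirms that (\sqrt{2}^{\sqrt{2}})^{\sqrt{2}} agrees with 2 up to rounding error. The variable names are chosen just for this illustration.

```python
import math

root2 = math.sqrt(2)

# a = sqrt(2)^sqrt(2), the number whose rationality the argument never needs to decide
a = root2 ** root2

# a^sqrt(2) = sqrt(2)^(sqrt(2)*sqrt(2)) = sqrt(2)^2, so this should be 2 up to rounding
power = a ** root2

print(a, power)
```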

Thursday, 1 August 2024

Ceva's Theorem

This is a classical result which is very often quite useful. Consider \triangle ABC and three points P, Q and R on BC, AC and AB respectively. The segments AP, BQ and CR, each joining a vertex to a point on the opposite side, are called cevians. Then Ceva's theorem states that:
The three cevians AP, BQ and CR are concurrent (meet at a point) if and only if \frac{AR}{RB}\frac{BP}{PC}\frac{CQ}{QA}=1
The quantity \frac{AR}{RB}\frac{BP}{PC}\frac{CQ}{QA} is called Ceva's Ratio and it is determined up to reciprocal.
Convince yourself by playing with the following GeoGebra applet (use P, Q and R, vertices won't change the ratio)


 
Let us first assume the given cevians are concurrent (as shown below):


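We will use the fact that the ratio of the areas of two triangles with equal altitudes is equal to the ratio of their bases. The area of \triangle XYZ will be denoted by XYZ.
\frac{ARC}{RBC}=\frac{AR}{RB}=\frac{ARO}{RBO}
Subtracting numerators and denominators (if \frac{p}{q}=\frac{r}{s} then both equal \frac{p-r}{q-s}), we get
\frac{AR}{RB}=\frac{ARC-ARO}{RBC-RBO}=\frac{ACO}{BCO}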
Similarly \frac{BP}{PC}=\frac{BAO}{CAO} and \frac{CQ}{QA}=\frac{CBO}{ABO}. Thus,
\frac{AR}{RB}\frac{BP}{PC}\frac{CQ}{QA}=\frac{ACO}{BCO}\frac{BAO}{CAO}\frac{CBO}{ABO}=1
Conversely, assume \frac{AR}{RB}\frac{BP}{PC}\frac{CQ}{QA}=1. Suppose AP and BQ meet at O, and suppose CO intersects line AB at R' (R' lies on segment AB, for more details see below). Then from above, \frac{AR'}{R'B}\frac{BP}{PC}\frac{CQ}{QA}=1. Using the given hypothesis, \frac{AR'}{R'B}\frac{BP}{PC}\frac{CQ}{QA}=\frac{AR}{RB}\frac{BP}{PC}\frac{CQ}{QA}, i.e. \frac{AR'}{R'B}=\frac{AR}{RB}. Since R and R' both lie between A and B, by uniqueness of the internal division ratio we have R=R'.
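
The forward direction can also be checked numerically. The following Python sketch picks a sample triangle and an interior point O (both chosen arbitrarily for illustration), constructs P, Q and R as the points where the lines AO, BO and CO meet the opposite sides, and evaluates Ceva's Ratio. The helper names `line_intersection` and `ceva_ratio` are made up for this example.

```python
import math

def line_intersection(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (the lines are assumed non-parallel)."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def ceva_ratio(A, B, C, O):
    """Ceva's Ratio (AR/RB)(BP/PC)(CQ/QA) for the cevians through O."""
    P = line_intersection(A, O, B, C)  # P on BC
    Q = line_intersection(B, O, A, C)  # Q on AC
    R = line_intersection(C, O, A, B)  # R on AB
    d = math.dist
    return (d(A, R) / d(R, B)) * (d(B, P) / d(P, C)) * (d(C, Q) / d(Q, A))

A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
O = (1.5, 1.0)  # a point in the interior of triangle ABC
ratio = ceva_ratio(A, B, C, O)
print(ratio)
```

Moving O around inside the triangle should leave the printed value at 1 up to rounding, which is what the area argument predicts.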



PS: Some of the assertions, like R' lying on segment AB, O being in the interior of \triangle ABC, or the uniqueness assertion which gives us R=R', need rigorous proofs, which can be found in George E. Martin's book listed in the references. It is not an easy read, so beware.


References:
  • The foundations of geometry and the non-Euclidean plane by George Edward Martin
  • A Sequel to the First Six Books of Euclid by John Casey
  • David Joyce's web version of Euclid's Elements


Saturday, 20 July 2024

The Butterfly Theorem

This is a simple geometric result, first published in the early 19th century.

Consider a chord AB of the given circle with midpoint M. Let CD and EF be two more chords passing through M such that CF and ED meet AB at G and H respectively. Then M is also the midpoint of GH.
For convenience, let AM = BM = a, GM = b and HM = c. Drop perpendiculars GG', GG'', HH' and HH'' on CM, FM, EM and DM respectively (and let their lengths be b', b'', c' and c'' respectively). The construction is shown below:

We now chase the similar triangles. Since \triangle MG'G \sim \triangle MH''H we have \frac{b}{c} = \frac{b'}{c''}. Similarly, \triangle MG''G \sim \triangle MH'H gives \frac{b}{c} = \frac{b''}{c'}. Also note \triangle CG'G \sim \triangle EH'H and \triangle FG''G \sim \triangle DH''H (angles subtended by the same chord are equal), so \frac{b'}{c'} = \frac{CG}{EH} and \frac{b''}{c''} = \frac{FG}{DH}. So we have,
\frac{b^2}{c^2}=\frac{b'b''}{c'c''}=\frac{CG \times FG}{DH \times EH}
The intersecting chords theorem gives us CG \times FG = AG \times BG=(a-b)(a+b) and similarly DH \times EH = (a-c)(a+c). Thus we have \frac{b^2}{c^2}= \frac{a^2-b^2}{a^2-c^2}. Cross-multiplying gives a^2b^2 = a^2c^2, and hence b=c.
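
Here is a small numerical check of the statement, written as a Python sketch under some assumptions of my own: the circle is the unit circle centred at the origin, AB is the horizontal chord at height y0 = 0.3 (so M = (0, y0)), and the two chords through M have arbitrarily chosen directions of 50 and 110 degrees. The function names are hypothetical helpers for this illustration.

```python
import math

def chord_through(M, theta):
    """Endpoints of the unit-circle chord through M with direction angle theta."""
    mx, my = M
    dx, dy = math.cos(theta), math.sin(theta)
    # Solve |M + t*(dx, dy)|^2 = 1, i.e. t^2 + 2*b*t + (|M|^2 - 1) = 0.
    b = mx * dx + my * dy
    disc = math.sqrt(b * b - (mx * mx + my * my - 1.0))
    t1, t2 = -b + disc, -b - disc
    return (mx + t1 * dx, my + t1 * dy), (mx + t2 * dx, my + t2 * dy)

def x_at_height(P, Q, y0):
    """x-coordinate of the point where line PQ crosses the horizontal line y = y0."""
    (x1, y1), (x2, y2) = P, Q
    s = (y0 - y1) / (y2 - y1)
    return x1 + s * (x2 - x1)

y0 = 0.3
M = (0.0, y0)
C, D = chord_through(M, math.radians(50))   # C above AB, D below
E, F = chord_through(M, math.radians(110))  # E above AB, F below
g_x = x_at_height(C, F, y0)  # G = CF meets AB
h_x = x_at_height(E, D, y0)  # H = ED meets AB
print(g_x, h_x)
```

Since M has x-coordinate 0, M is the midpoint of GH exactly when g_x + h_x = 0, which is what the run should show up to rounding.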

References:
  • A Sequel to the First Six Books of Euclid by John Casey
  • David Joyce's web version of Euclid's Elements

Thursday, 6 June 2024

A Twist in Classical Proof of Infinitude of Primes

Most of us are familiar with the fact that there are infinitely many prime numbers. The classic argument of Euclid is as follows:

Consider a (nonempty) finite list of primes p_1, p_2, \ldots, p_n. Since N := p_1p_2 \cdots p_n +1 > 1, there must be a prime q dividing N, and it is easy to see that q is not in the given list: otherwise q would divide both N and p_1p_2 \cdots p_n, and hence their difference 1.

Prime Factorization of integers plays an important role in the above argument  and this ultimately rests on Euclid's Lemma: if a,b are integers and p is a prime dividing ab then p divides either a or b.

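Here is a modified version of the classical proof. Choose any integer n > 1. Note that n and n+1 are coprime, so n(n+1) has at least two distinct prime factors. Similarly, n(n+1) and n(n+1)+1 are coprime, so their product has at least three distinct prime factors. Continuing in this way produces integers with arbitrarily many distinct prime factors.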
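
To watch the modified argument at work, here is a minimal Python sketch. It starts from n = 2 (an arbitrary choice), repeatedly replaces n by n(n+1), and counts the distinct primes dividing each new value using plain trial division. The function names are invented for this example, and trial division is only practical here because the numbers stay small for the first few steps.

```python
def distinct_prime_factors(n):
    """Set of distinct prime factors of n, found by trial division."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)  # whatever is left is itself prime
    return factors

def prime_counts(start, steps):
    """Number of distinct primes dividing n after each replacement n -> n(n+1)."""
    n = start
    counts = []
    for _ in range(steps):
        n = n * (n + 1)
        counts.append(len(distinct_prime_factors(n)))
    return counts

counts = prime_counts(2, 5)
print(counts)
```

Each step multiplies by a number greater than 1 that is coprime to everything so far, so the printed counts must be strictly increasing.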

Monday, 17 July 2023

Informal Introduction to Continuous functions

You may have heard people describe the notion of a continuous function in various ways:
  • the graph of such a function can be drawn without lifting the pen
  • there is no sudden jump in values
  • the change in output can be made as small as we please by making the change in input sufficiently small.
Although these are informal ways to talk about continuity of functions, they are a good way to visualize some well behaved functions. Mathematicians like Bolzano and Cauchy tried to give a rigorous definition of continuity (and came pretty close). Finally, Weierstrass succeeded in giving a satisfactory (and most commonly used) definition of a continuous function.
For simplicity, we shall assume that the domain of the real valued function is an interval (e.g. (0,1), \mathbb{R}^+, [-1,1]). However, the definition is still valid for any nonempty subset of \mathbb{R}. Let f:I \to \mathbb{R} be a real valued function and let a \in I. Then f is said to be continuous at a if for any open interval V around f(a), there exists an open interval U around a such that the image of U \cap I under f is contained in V. In logical notation:
\forall \epsilon > 0 \; \exists \delta > 0 \;  \forall x \in I \;( |x - a| < \delta \implies |f(x) - f(a)| < \epsilon)
A function f:I \to \mathbb{R} is said to be continuous if it is continuous at every a \in I. Since we are checking continuity of f at every point, this is sometimes referred to as pointwise continuity. In logical notation,
\forall y  \in I \; \forall \epsilon > 0 \; \exists \delta > 0 \; \forall x \in I\; ( |x - y| < \delta \implies |f(x) - f(y)| < \epsilon)
Observe that \delta may depend on \epsilon and on the point a (where we are checking continuity). In the case where \delta can be chosen independently of the point a, we say that f is uniformly continuous (the unqualified adjective continuous is reserved for pointwise continuity).

Let f:I \to \mathbb{R} be a continuous function and let a<b where a,b \in I. Suppose c is a real number lying between f(a) and f(b) (say f(a) < c < f(b)). Since I is an interval, [a,b] \subseteq I. Now collect all numbers x \in [a,b] such that f(x) < c. Clearly a belongs to this collection and b does not. The collection is nonempty and bounded above by b, so it has a least upper bound (this comes from a fundamental property of \mathbb{R} called the supremum property), which we call \alpha. Now, from continuity of f, if f takes a positive (or negative) value at a point, it takes positive (or negative) values on a sufficiently small interval around that point (it is quite easy to prove this fact). Applying this to f(x) - c, we see that each of f(\alpha) > c and f(\alpha) < c contradicts the definition of \alpha: in the first case some number smaller than \alpha would already be an upper bound, and in the second case the collection would contain numbers larger than \alpha. So we have shown that there exists \alpha \in (a,b) such that f(\alpha) = c.

This result is called the intermediate value theorem, and the fact that I is an interval played a role in its proof. It justifies the informal idea that a continuous function has no sudden jumps.
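
The proof above is not constructive, but the theorem suggests a simple way to approximate such an \alpha numerically. The Python sketch below uses bisection, which is a different technique from the supremum argument: it repeatedly halves an interval on which f passes from below c to at least c. The function name `bisect_value`, the sample function x^3 + x and the tolerance are all assumptions made for this illustration.

```python
def bisect_value(f, a, b, c, tol=1e-12):
    """Approximate alpha in [a, b] with f(alpha) = c.

    Assumes f is continuous on [a, b] and f(a) < c < f(b).
    """
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < c:
            lo = mid  # keep the invariant f(lo) < c
        else:
            hi = mid  # keep the invariant f(hi) >= c
    return (lo + hi) / 2

def f(x):
    return x ** 3 + x

# f(0) = 0 < 5 < 10 = f(2), so some alpha in (0, 2) has f(alpha) = 5
alpha = bisect_value(f, 0.0, 2.0, 5.0)
print(alpha, f(alpha))
```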

References:
  • Calculus by M. Spivak





Wednesday, 12 July 2023

Naïve Set Theory and Paradoxes

A naive approach to set theory creates a lot of foundational problems, famous examples being paradoxes of Russell, Cantor and Burali-Forti. Discovery of these paradoxes implied that Cantor's original formulation of set theory was inconsistent. This article will introduce the aforementioned paradoxes.

Russell's paradox is closely related to the notion of a universal set. It can be stated as follows: Let R be the set of all sets that are not members of themselves. Is R a member of itself? It doesn't matter whether we assume R \in R or R \not \in R, we reach a contradiction either way. In logical notation:

   Let R = \{r| r \not \in r \}. Then R \in R \iff R \not \in R 

To understand Cantor's paradox we need to understand the notion of a cardinal. We say that two sets are equinumerous if there exists a bijection between them. It is easily shown that equinumerosity is an equivalence relation. So we can pick out a representative from each equivalence class and we shall call them cardinals. Let |X| denote the cardinal associated with X. Let \chi denote the set of all cardinals. We will show that there exists a cardinal not belonging to \chi, which is a contradiction.

For that purpose, we give a partial order on \chi as follows: \mathcal{A} \leq \mathcal{B} iff there exists an injection from \mathcal{A} to \mathcal{B}. In particular, \mathcal{A} < \mathcal{B} means that there is an injection f:\mathcal{A} \to \mathcal{B} but there is no bijection between them. It is fairly easy to check that \leq is reflexive and transitive. Antisymmetry follows from the Schröder–Bernstein theorem. There is a trivial injection from a set X into its power set \mathcal{P}(X), namely x \mapsto \{ x\}. However, Cantor proved that there is no bijection between a set and its power set. So |X| < |\mathcal{P}(X)|. (Note that if A \subseteq B then |A| \leq |B|.)
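
Cantor's result is usually proved with the diagonal set: for any map f from X to \mathcal{P}(X), the set D = \{x \in X | x \not \in f(x)\} is never a value of f, so f cannot be onto. That proof is not spelled out in this post, so treat the following Python sketch as an illustration of the standard argument on a tiny finite set, checked by brute force over every possible f. The function names are invented for this example.

```python
from itertools import product

def powerset(X):
    """All subsets of the finite collection X, as frozensets."""
    items = list(X)
    return [frozenset(x for x, keep in zip(items, mask) if keep)
            for mask in product([False, True], repeat=len(items))]

def diagonal_set(X, f):
    """Cantor's set D = {x in X : x not in f(x)} for a map f: X -> P(X) given as a dict."""
    return frozenset(x for x in X if x not in f[x])

X = [0, 1, 2]
subsets = powerset(X)

# Enumerate every function f: X -> P(X) and check that D is never one of its values.
missed_every_time = all(
    diagonal_set(X, dict(zip(X, images))) not in images
    for images in product(subsets, repeat=len(X))
)
print(len(subsets), missed_every_time)
```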

Let \mathcal{C} be the cardinal associated with union of all cardinals ie. \mathcal{C} = |\bigcup \chi|. Noting that every cardinal is a subset of \bigcup \chi and that inclusion maps are injective, we have \mathcal{A} \leq \mathcal{C}, for all \mathcal{A} \in \chi. However \mathcal{C} < |\mathcal{P}(\mathcal{C})| and so |\mathcal{P}(\mathcal{C})| \not \in \chi which is absurd.

The Burali-Forti paradox deals with ordinals. We are interested in partially ordered sets (W, \leq) where every nonempty subset of W has a least element. These are called well ordered sets (or wosets). Given two partial orders (W_1, \leq_1) and (W_2, \leq_2), a function f:W_1 \to W_2 is called an order-isomorphism if f is a bijection and preserves order in both directions, i.e. a \leq_1 b iff f(a) \leq_2 f(b). In case such a function exists, we say (W_1, \leq_1) is isomorphic to (W_2, \leq_2). It is easy to check that this is an equivalence relation. Also note that the property of being well ordered is preserved by isomorphism. So from each equivalence class whose members are wosets, we can pick a representative which we call an ordinal. Let \Omega denote the set of all ordinals.

A classical result states that given two wosets (W_1, \leq_1) and (W_2, \leq_2), exactly one of the following holds: (this is known as trichotomy of ordinals)

  • (W_1, \leq_1) isomorphic to  (W_2, \leq_2)
  • (W_1, \leq_1) isomorphic to a proper initial segment of (W_2, \leq_2)
  • (W_2, \leq_2) isomorphic to a proper initial segment of (W_1, \leq_1)

So, we can define a partial order \preceq on \Omega in a natural way. With some work, we can show that (\Omega, \preceq) is a woset. So there is an ordinal \omega associated with (\Omega, \preceq). Cantor had proved that any ordinal \beta is the ordinal associated with the set \{\alpha \in \Omega | \alpha \prec \beta \} well ordered using \preceq. So, in particular, the proper initial segment \{\alpha \in \Omega | \alpha \prec \omega \} of \Omega is order isomorphic to \omega and hence isomorphic to \Omega, which contradicts the trichotomy theorem.

Cantor's theory of sets consisted of the laws of first order logic, the axiom of extensionality (which states that two sets are equal iff they have the same elements) and the axiom schema of comprehension (which states that given any logical property, we can construct a set containing precisely those elements satisfying the given property). The comprehension schema turns out to be the source of all these paradoxes. Standard theories like ZF set theory prevent these paradoxes by weakening the comprehension axiom, so we cannot construct the universal set, the set R described in Russell's paradox, the set of all cardinals \chi or the set of all ordinals \Omega.

PS: This is an unreleased draft of an article which is probably gonna appear in CMIT's Donut. However, I have to cut down some words in final version.

References:
  • Foundations of Mathematics by Kenneth Kunen
  • Naive Set Theory by Paul Halmos

Thursday, 29 June 2023

The Nine Point Circle

The nine point circle (also known as Feuerbach's circle) is one of the most interesting topics in elementary geometry. It doesn't require many prerequisites, so this post can be read by a high school student who has some idea about the geometry of circles and triangles.

Just a reminder: All three altitudes of a triangle meet at a single point, which we call the orthocenter, and a set of points is said to be concyclic if they all lie on a circle.






Consider \triangle ABC and let O be the orthocenter (shown in green). Let A_a be the foot of the altitude from vertex A on side BC, A_m be the midpoint of side BC and A_o be the midpoint of line segment AO (we have used the colors red, blue and orange respectively for these points). We use similar notation for the points corresponding to vertices B and C. Then the nine points (which sometimes may not be distinct) A_o, B_o, C_o (midpoints between orthocenter and vertices), A_m, B_m, C_m (midpoints of sides) and A_a, B_a, C_a (feet of altitudes) lie on the same circle, which is commonly called the nine point circle (shown as a dotted circle).

Try interacting with the GeoGebra applet given below (\Omega is the center of nine point circle):


To prove this we need some elementary results:

  • A convex quadrilateral is cyclic (i.e. its vertices are concyclic) if and only if its opposite angles are supplementary (i.e. they add up to 180^{\circ}). One half of this result is Euclid's proposition 22 in Book III.
  • The line segment connecting the midpoints of two sides of a triangle is parallel to the third side and has half its length. This is called the midpoint theorem.
  • The center of the circumcircle of a right triangle is the midpoint of its hypotenuse, so the hypotenuse is a diameter of the circumcircle. This is the converse of a famous result called Thales' theorem.
Let us start by constructing line segments C_mB_m, B_oC_o, C_mB_o and C_oB_m.




Now in \triangle OBC, we use the midpoint theorem to conclude B_oC_o is parallel to BC. Similarly in \triangle ABC we get C_mB_m is parallel to BC. By considering \triangle AOC and \triangle AOB, we get that C_mB_o and C_oB_m are parallel (since both are parallel to AO by the midpoint theorem). Since line AO (i.e. line AA_a) is perpendicular to BC, we get C_oB_m is perpendicular to C_mB_m. So, \square C_mB_mC_oB_o is a rectangle and in particular is a cyclic quadrilateral. Moreover, by the converse of Thales' theorem, the diagonals C_mC_o and B_mB_o are diameters of the circle passing through the vertices of \square C_mB_mC_oB_o. Since \angle C_mC_aC_o and \angle B_mB_aB_o are right angles, C_a and B_a also lie on this circle, again by the converse of Thales' theorem.

We have proved that C_m,B_m,C_o,B_o,C_a, B_a lie on a circle with diameters C_mC_o and B_mB_o. Similarly, we can show A_m,B_m,A_o,B_o,A_a, B_a lie on a circle with diameters A_mA_o and B_mB_o and A_m,C_m,A_o,C_o,A_a, C_a lie on a circle with diameters A_mA_o and C_mC_o. Looking at the diameters of these three circles, we see that any two of them have a common diameter. So these three circles must coincide, and this common circle is the required nine point circle.
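
The statement is easy to test numerically. The Python sketch below builds the nine points for one sample triangle (coordinates chosen arbitrarily), fits the circle through the three side midpoints, and measures how far every one of the nine points is from its center. To locate the orthocenter it uses the vector identity O = A + B + C - 2U, where U is the circumcenter; this identity is a standard fact that is not derived in this post, and all function names are made up for the example.

```python
import math

def circumcenter(A, B, C):
    """Center of the circle through the three non-collinear points A, B, C."""
    ax, ay = A
    bx, by = B
    cx, cy = C
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy)

def midpoint(P, Q):
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)

def foot(P, Q, R):
    """Foot of the perpendicular dropped from P onto line QR."""
    dx, dy = R[0] - Q[0], R[1] - Q[1]
    t = ((P[0] - Q[0]) * dx + (P[1] - Q[1]) * dy) / (dx * dx + dy * dy)
    return (Q[0] + t * dx, Q[1] + t * dy)

def nine_points(A, B, C):
    U = circumcenter(A, B, C)
    # Orthocenter via the identity O = A + B + C - 2U (assumed, not proved here).
    O = (A[0] + B[0] + C[0] - 2 * U[0], A[1] + B[1] + C[1] - 2 * U[1])
    return [
        midpoint(A, O), midpoint(B, O), midpoint(C, O),  # A_o, B_o, C_o
        midpoint(B, C), midpoint(C, A), midpoint(A, B),  # A_m, B_m, C_m
        foot(A, B, C), foot(B, C, A), foot(C, A, B),     # A_a, B_a, C_a
    ]

A, B, C = (0.0, 0.0), (6.0, 0.0), (2.0, 5.0)
points = nine_points(A, B, C)
center = circumcenter(*points[3:6])  # circle through the three side midpoints
radii = [math.dist(center, P) for P in points]
print(center, radii)
```

All nine distances should agree up to rounding, and the common value is the radius of the nine point circle.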

References:
  •  A Sequel to the First Six Books of Euclid by John Casey
  • David Joyce's web version of Euclid's Elements



Sunday, 25 June 2023

Riemann Sphere and Stereographic Projection

While studying complex valued functions, it is sometimes useful to consider the extended complex plane \mathbb{C} \cup \{ \infty \}, where we append an additional point \infty to the set of usual complex numbers. This is especially helpful in the study of Möbius maps z \mapsto \frac{az+b}{cz+d} where ad-bc \not = 0.

The Riemann sphere provides us with a model of the extended complex plane. This makes it useful in complex analysis because it allows us to make sense of division by zero in some circumstances (\frac{0}{0} is still undefined).

We start with the unit sphere S^2 in \mathbb{R}^3, i.e. the spherical surface with center at the origin and unit radius (the set of points (x,y,z) given by x^2 +y^2 +z^2=1). First, we shall show there is a bijection between \mathbb{R}^2 and S^2 minus a single point. (We identify the xy-plane with \mathbb{R}^2.)

Start with a point P on S^2 other than the north pole N = (0,0,1). Draw a ray starting from N passing through P. This ray will intersect the xy-plane at a unique point Q. The map that takes P to Q is called stereographic projection and is denoted by \Pi.


We shall use some properties of vectors to derive stereographic projection explicitly. Q - N is parallel to P - N, so P - N = k(Q - N) for some scalar k. Denote P=(x,y,z) and Q=(u,v,0). Then, we get (x,y,z) = (ku,kv,1-k) so that k=1-z. Note z \not = 1 since P \not = N.

\therefore \Pi (x,y,z) = (u,v,0) = \left( \frac{x}{1-z}, \frac{y}{1-z}, 0 \right)

Since x^2 + y^2 + z^2 = 1, we have k^2u^2 + k^2v^2 + (1-k)^2 = 1. Simplifying (while noting k \not = 0 since z \not = 1),

k = \frac{2}{u^2+v^2 +1}

\therefore P = (x,y,z) = \left(  \frac{2u}{u^2+v^2 +1}, \frac{2v}{u^2+v^2 +1}, \frac{u^2 +v^2 -1}{u^2+v^2 +1} \right)

It takes some simple calculations to verify that \Pi : S^2 \setminus \{ N \} \to \mathbb{R}^2 is a bijection and the map (u,v,0) \mapsto \left(  \frac{2u}{u^2+v^2 +1}, \frac{2v}{u^2+v^2 +1}, \frac{u^2 +v^2 -1}{u^2+v^2 +1} \right) is its inverse.
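
The two formulas translate directly into code. This Python sketch implements \Pi and the claimed inverse for one arbitrarily chosen point (u, v) of the plane, then checks that the image lies on the unit sphere and that projecting it back returns (u, v). The function names are my own, and the plane point is returned as a pair (u, v) with the trailing 0 dropped.

```python
import math

def stereographic(x, y, z):
    """Pi: S^2 minus the north pole -> plane, (x, y, z) -> (u, v). Requires z != 1."""
    return (x / (1 - z), y / (1 - z))

def inverse_stereographic(u, v):
    """Inverse map: plane -> S^2 minus the north pole."""
    s = u * u + v * v + 1
    return (2 * u / s, 2 * v / s, (u * u + v * v - 1) / s)

u, v = 0.7, -1.9
P = inverse_stereographic(u, v)
on_sphere = math.isclose(sum(c * c for c in P), 1.0)
round_trip = stereographic(*P)
print(P, on_sphere, round_trip)
```

Points far from the origin in the plane land close to N = (0,0,1), which is the intuition behind sending \infty to the north pole.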


Interact with GeoGebra applet to see how stereographic projection actually works.

In fact our map \Pi is conformal (ie. it preserves angles) but proof of this assertion requires some tools from differential geometry.

We can extend \Pi to the whole sphere S^2 by simply defining \Pi (N) = \infty. Here, \infty is just a symbol. We similarly extend \Pi^{-1} :  \mathbb{R}^2 \cup \{ \infty \} \to S^2. Since we can interpret \mathbb{C} as \mathbb{R}^2 (from a topological point of view), we can also define \Pi^{-1} :  \mathbb{C} \cup \{ \infty \} \to S^2 as z \mapsto \left(  \frac{2 \Re(z)}{|z|^2 +1}, \frac{2 \Im(z)}{|z|^2 +1}, \frac{|z|^2 -1}{|z|^2 +1} \right)

where \Re(z) and \Im(z) are real and imaginary part of z respectively, and

\infty \mapsto (0,0,1)

PS: From a topological point of view, we can define a topology \tau_1 on \mathbb{C} \cup \{\infty\} such that \mathbb{C} with its usual topology \tau_2 is a subspace of (\mathbb{C} \cup \{\infty\}, \tau_1), and (\mathbb{C} \cup \{\infty\}, \tau_1) is the one point compactification of (\mathbb{C}, \tau_2).

References:

  • A Pathway to Complex Analysis by S. Kumaresan
  • Topology by J. Munkres
  • A Comprehensive introduction to Differential Geometry Vol. 2 by M. Spivak










Wednesday, 21 June 2023

What is a matrix?

The following post is mostly aimed at high school students interested in mathematics (and even budding undergrads). We have learned in our schools that a matrix is a rectangular array of numbers. But is that all?

In physics we have seen vector quantities like force, displacement etc. A vector like 2i+4j+7k can also be represented by the ordered triple (2,4,7). We can generalize to ordered n-tuples and talk about vectors in n-dimensional space. We consider the set \mathbb{R}^n of all ordered n-tuples of real numbers (each individual n-tuple will be referred to as a vector). Let X = \left( x_1, x_2,...,x_n \right), Y = \left( y_1, y_2,...,y_n \right) and let k be a real number (also called a scalar). Then X+Y is defined as the vector \left( x_1 + y_1, x_2 + y_2,...,x_n + y_n \right) and k \cdot X = \left( kx_1, kx_2,...,kx_n \right). The operations + and \cdot are called vector addition and scalar multiplication respectively. We usually write k \cdot X as kX. For n=2 and 3, this coincides with the usual notion of vectors in classical physics and for n=1, it is the usual addition and multiplication of real numbers.

Some properties which we can easily verify are:
  • X+Y = Y+X
  • X+(Y+Z) = (X+Y)+Z
  • There exists \textbf{0} \in \mathbb{R}^n such that for all X \in \mathbb{R}^n, X +\textbf{0} = X. Just choose \textbf{0} = \left( 0, 0,...,0 \right).
  • For all X \in \mathbb{R}^n there exists X' \in  \mathbb{R}^n such that X+X'=\textbf{0}. Take X' = (-1)X and it is not hard to see that this is the only possible choice for X'.
  • (k_1k_2) X = k_1 (k_2  X)
  • (k_1 + k_2)  X = k_1  X + k_2 X
  • k (X +Y) = k  X + k  Y
  • 0X = \textbf{0} and 1X = X
It is easy to see that every vector X in \mathbb{R}^n can be represented uniquely in the form \sum_{i=1}^{n} x_i E_i where X = \left( x_1, x_2,...,x_n \right) and E_i is the n-tuple whose i^{th} entry is 1 and whose other entries are 0. These E_i's are said to form a basis for \mathbb{R}^n. In case of confusion we shall denote them as E_i^n to indicate the dimension of the space.

We say that a function T: \mathbb{R}^n \to \mathbb{R}^m is linear if for all X, Y \in \mathbb{R}^n and k \in \mathbb{R}, T(kX +Y) = kT(X) + T(Y).
Can you think of some examples of linear maps? Since T(X) = \sum_{i=1}^{n} x_i T(E_i), the values at the basis elements E_i, i = 1,2,...,n determine the linear map. Moreover, if you wish to define a linear map T satisfying T(E_i) = F_i, i=1,2,...,n, define T(X) =  \sum_{i=1}^{n} x_i F_i. It is not hard to verify that this map is well defined and linear. (Try this out! Read the paragraph on basis again.)

If T: \mathbb{R}^n \to \mathbb{R}^m and S: \mathbb{R}^m \to \mathbb{R}^l are maps, we can define the composition S \circ T: \mathbb{R}^n \to \mathbb{R}^l as the map (S \circ T)(X) = S(T(X)). It is an easy exercise to verify that if T and S are linear, then S \circ T is linear. (We usually write ST for S \circ T.) If T_1: \mathbb{R}^n \to \mathbb{R}^m and T_2: \mathbb{R}^n \to \mathbb{R}^m are maps, then we can define T_1 + T_2:\mathbb{R}^n \to \mathbb{R}^m as (T_1 + T_2)(X) = T_1(X) + T_2(X). If T_1 and T_2 are linear, so is T_1 + T_2.

In the previous paragraph, we saw how the values of a linear map T at the basis elements completely determine the map, and how to construct a linear map with given values at the basis elements (and this construction is unique by the first statement). We can interpret elements of \mathbb{R}^n as column vectors, i.e. n \times 1 matrices. Let T: \mathbb{R}^n \to \mathbb{R}^m be linear. Now, form an m \times n matrix with its i^{th} column being T(E_i^n) (since T(E_i^n) \in \mathbb{R}^m, it is interpreted as an m \times 1 matrix). We denote this matrix as \left[T \right]. The procedure for constructing a linear map with given values at the basis elements tells us that every m \times n matrix is of the form \left[T \right] for some linear map T: \mathbb{R}^n \to \mathbb{R}^m. So there is a bijective (one to one and onto) correspondence between the set of all linear functions from \mathbb{R}^n to \mathbb{R}^m (denoted as \mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^m \right)) and the set of all m \times n matrices (denoted as \mathcal{M}_{m \times n} (\mathbb{R})).

Let A be an m \times n matrix. We denote the (i,j)^{th} entry (i.e. the number in the i^{th} row and j^{th} column) of A by A_{ij}. Recall that if A is an m \times n matrix and B is an n \times p matrix, then we can define the matrix product AB by (AB)_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj}. Usual properties like commutativity need not hold, i.e. it is not always true that AB = BA. However, associativity still holds, i.e. if A is an m \times n, B is an n \times p and C is a p \times q matrix, then A(BC) = (AB)C. (It requires a little work.) The distributive property is also valid, i.e. A(B+C) = AB +AC and (A+B)C = AC + BC, whenever they are defined. (In addition of matrices, we add entrywise, i.e. (A+B)_{ij} = A_{ij} + B_{ij}.)
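
The product formula and the remarks about commutativity and associativity can be tried out on small examples. In this Python sketch, matrices are stored as lists of rows, `mat_mul` is a hypothetical helper that implements (AB)_{ij} = \sum_k A_{ik}B_{kj} literally, and the three 2 \times 2 matrices are arbitrary choices for illustration.

```python
def mat_mul(A, B):
    """Matrix product of A (m x n) and B (n x p), both given as lists of rows."""
    n = len(B)
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[0, 1],
     [0, 0]]
B = [[0, 0],
     [1, 0]]
C = [[2, 3],
     [5, 7]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB, BA)  # the two products differ, so AB = BA fails here

# Associativity holds for this sample triple.
left = mat_mul(A, mat_mul(B, C))
right = mat_mul(mat_mul(A, B), C)
print(left == right)
```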

Let us look at the matrix \left[T \right] in more detail. Denote \left[T \right]_{ij} by t_{ij}. From our above discussion, we see that T(E_j^n) = \sum_{i=1}^{m} t_{ij}E_i^m. Let X \in \mathbb{R}^n as given previously. We shall compute \left[T \right]X (the product will give a m \times 1 matrix). (\left[T \right]X)_{i1} =  \sum_{k=1}^{n} t_{ik}X_{k1} = \sum_{k=1}^{n} t_{ik}x_k
Now, T(X) = \sum_{k=1}^{n} x_k T(E_k^n) = \sum_{k=1}^{n} x_k \left( \sum_{i=1}^{m} t_{ik}E_i^m \right) =  \sum_{i=1}^{m} \left( \sum_{k=1}^{n} t_{ik}x_k \right) E_i^m
\therefore T(X) = \sum_{i=1}^{m}( \left[T \right]X)_{i1}E_i^m
Thus \left[T \right]X gives T(X) in column vector form.
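
The recipe "the j^{th} column of \left[T \right] is T(E_j)" is easy to carry out by machine. The Python sketch below builds \left[T \right] for one linear map from \mathbb{R}^3 to \mathbb{R}^2 and compares \left[T \right]X with T(X). The map T and the vector X are made-up examples, vectors are plain lists, and the helper names are my own.

```python
def basis(n):
    """Standard basis vectors E_1, ..., E_n of R^n as lists."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]

def matrix_of(T, n):
    """Matrix [T] of a linear map T on R^n: its j-th column is T(E_j). Stored as a list of rows."""
    columns = [T(E) for E in basis(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_vec(A, X):
    """Product of the matrix A with the column vector X."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

def T(X):
    """A sample linear map R^3 -> R^2, chosen only for illustration."""
    x1, x2, x3 = X
    return [2 * x1 - x3, x1 + 4 * x2 + 3 * x3]

M = matrix_of(T, 3)
X = [5, -1, 2]
print(M)
print(mat_vec(M, X), T(X))  # the two vectors agree
```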

It is easy to verify that \left[ T_1 + T_2 \right] =  \left[T_1 \right] +  \left[T_2 \right]. But it takes slightly more work to show that \left[ST \right] =  \left[S \right] \left[T \right] (take it as a challenging exercise, it is similar to the calculation we have done above). Note that the identity function id_n:\mathbb{R}^n \to \mathbb{R}^n belongs to \mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^n \right) and \left[id_n \right] is the n \times n identity matrix I_n. The zero map \mathbb{0}:\mathbb{R}^n \to \mathbb{R}^m given by \mathbb{0}(X) = \textbf{0} is linear and unsurprisingly \left[ \mathbb{0} \right] is the m \times n matrix with all entries 0.
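
Before attempting the exercise, it may help to see the identity \left[ST \right] = \left[S \right] \left[T \right] hold on a concrete pair of maps. In this Python sketch, T maps \mathbb{R}^2 to \mathbb{R}^3 and S maps \mathbb{R}^3 to \mathbb{R}^2; both are invented for illustration, as are the helper names. A numerical check on one example is of course not a proof.

```python
def matrix_of(T, n):
    """Matrix [T] of a linear map T on R^n: its j-th column is T(E_j). Stored as a list of rows."""
    columns = [T([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_mul(A, B):
    """Matrix product using (AB)_ij = sum over k of A_ik * B_kj."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def T(X):
    """Sample linear map R^2 -> R^3."""
    x1, x2 = X
    return [x1 + x2, 2 * x1, -x2]

def S(Y):
    """Sample linear map R^3 -> R^2."""
    y1, y2, y3 = Y
    return [y1 - y3, 3 * y2 + y1]

def ST(X):
    """The composition S after T, a map R^2 -> R^2."""
    return S(T(X))

lhs = matrix_of(ST, 2)                            # [ST]
rhs = mat_mul(matrix_of(S, 3), matrix_of(T, 2))   # [S][T]
print(lhs, rhs)
```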

We have seen there is a deep connection between linear maps and matrices. The correspondence T \mapsto \left[T \right] is a bijection \mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^m \right) \to \mathcal{M}_{m \times n} (\mathbb{R}) that preserves structure (the technical term is isomorphism). In conclusion, matrices are precisely linear maps.

PS: We worked with set of real numbers \mathbb{R}. We could also work with \mathbb{Q} or \mathbb{C} (ie. set of rationals and complex numbers respectively). Instead of \mathbb{R}^n or \mathbb{C}^n, the same reasoning applies when we work with algebraic structures called finite dimensional vector spaces. Pick up a book on linear algebra to learn more. Hope this was a good motivation to study matrices.

References:

  • Linear Algebra 4^{th} ed. by Friedberg, Insel and Spence
  • Linear Algebra Done Right 3^{rd} ed. by Axler
  • What Is Mathematics? by Courant and Robbins



