Bunch of Thoughts

Wednesday, 7 August 2024

Mathematical Biscuit I

Can you find two irrational numbers $a,b$ such that $a^b$ is rational? Surprisingly, yes, and the argument is very easy.

If $\sqrt{2}^{\sqrt{2}}$ is rational then we are done (take $a=b=\sqrt{2}$).

If $\sqrt{2}^{\sqrt{2}}$ is irrational, take $a=\sqrt{2}^{\sqrt{2}}$ and $b=\sqrt{2}$. Then $$(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}=\sqrt{2}^{\sqrt{2} \sqrt{2}}=\sqrt{2}^2=2$$ and we are done :)
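As a quick floating-point sanity check of the identity used above (it only illustrates the arithmetic and says nothing about rationality), one might compute both powers numerically. The variable names are just for illustration.

```python
import math

s = math.sqrt(2)
a = s ** s   # sqrt(2)^sqrt(2), roughly 1.63
b = a ** s   # (sqrt(2)^sqrt(2))^sqrt(2), equal to 2 up to rounding error

print(a)
print(b)
```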

PS: It turns out that $\sqrt{2}^{\sqrt{2}}$ is indeed irrational (in fact transcendental). This is a consequence of an advanced result called the Gelfond–Schneider theorem. However, this was irrelevant to our argument.

Thursday, 1 August 2024

Ceva's Theorem

This is a classical result which is quite often useful. Consider $\triangle ABC$ and three points $P,Q$ and $R$ on $BC, AC$ and $AB$ respectively. The segments $AP, BQ$ and $CR$, each joining a vertex to a point on the opposite side, are called cevians. Then Ceva's theorem states that:

The three cevians $AP, BQ$ and $CR$ are concurrent (meet at a point) if and only if $$\frac{AR}{RB}\frac{BP}{PC}\frac{CQ}{QA}=1$$

The quantity $\frac{AR}{RB}\frac{BP}{PC}\frac{CQ}{QA}$ is called Ceva's ratio and it is determined up to reciprocal.

Convince yourself by playing with the following GeoGebra applet (use $P, Q$ and $R$, vertices won't change the ratio)

Let us first assume the given cevians are concurrent at a point $O$ (as shown below):

We will use the fact that the ratio of the areas of two triangles with equal altitudes is equal to the ratio of their bases. The area of $\triangle XYZ$ will be denoted by $XYZ$.

$$\frac{ARC}{RBC}=\frac{AR}{RB}=\frac{ARO}{RBO}$$

Using the property of ratios that $\frac{p}{q}=\frac{r}{s}$ implies $\frac{p}{q}=\frac{p-r}{q-s}$ (when $q \neq s$):

$$\frac{AR}{RB}=\frac{ARC-ARO}{RBC-RBO}=\frac{ACO}{BCO}$$

Similarly $\frac{BP}{PC}=\frac{BAO}{CAO}$ and $\frac{CQ}{QA}=\frac{CBO}{ABO}$. Thus,

$$\frac{AR}{RB}\frac{BP}{PC}\frac{CQ}{QA}=\frac{ACO}{BCO}\frac{BAO}{CAO}\frac{CBO}{ABO}=1$$
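The forward direction can be checked on a concrete triangle. The sketch below uses exact rational arithmetic; the coordinates of $A, B, C$ and the interior point $O$ are arbitrary choices, and the helper names `meet` and `ratio` are hypothetical.

```python
from fractions import Fraction as F

def meet(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def ratio(x, y, z):
    """XY / YZ for a point y lying strictly between x and z."""
    i = 0 if z[0] != y[0] else 1   # use a coordinate that actually varies
    return (y[i] - x[i]) / (z[i] - y[i])

A, B, C = (F(0), F(0)), (F(4), F(0)), (F(1), F(3))
O = (F(2), F(1))                   # a point inside the triangle

P = meet(A, O, B, C)               # cevian AP meets BC at P
Q = meet(B, O, A, C)               # cevian BQ meets AC at Q
R = meet(C, O, A, B)               # cevian CR meets AB at R

ceva = ratio(A, R, B) * ratio(B, P, C) * ratio(C, Q, A)
print(ceva)
```

Because `Fraction` is exact, the product comes out as exactly $1$ rather than approximately $1$.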

Conversely, assume $\frac{AR}{RB}\frac{BP}{PC}\frac{CQ}{QA}=1$. Suppose $AP$ and $BQ$ meet at $O$, and suppose line $CO$ intersects line $AB$ at $R'$ ($R'$ lies on segment $AB$; for more details see below). Then from above, $\frac{AR'}{R'B}\frac{BP}{PC}\frac{CQ}{QA}=1$. Using the given hypothesis, $\frac{AR'}{R'B}\frac{BP}{PC}\frac{CQ}{QA}=\frac{AR}{RB}\frac{BP}{PC}\frac{CQ}{QA}$, i.e. $\frac{AR'}{R'B}=\frac{AR}{RB}$. Since $R$ and $R'$ both lie between $A$ and $B$, by uniqueness of the internal division point with a given ratio we have $R=R'$.

PS: Some of the assertions, like $R'$ lying on segment $AB$, $O$ being in the interior of $\triangle ABC$, or the uniqueness assertion which gives us $R=R'$, need rigorous proofs, which can be found in George E. Martin's book listed in the references. It is not an easy read, so beware.

References:

The foundations of geometry and the non-Euclidean plane by George Edward Martin
A Sequel to the First Six Books of Euclid by John Casey
David Joyce's web version of Euclid's Elements

Saturday, 20 July 2024

The Butterfly Theorem

This is a simple geometric result, first published in the early 19th century.

Consider a chord $AB$ of the given circle with midpoint $M$. Let $CD$ and $EF$ be two more chords passing through $M$, and suppose $CF$ and $ED$ meet $AB$ at $G$ and $H$ respectively. Then $M$ is also the midpoint of $GH$.

For our convenience, let $AM = BM =a$, $GM=b$ and $HM=c$. Drop perpendiculars $GG'$, $GG''$, $HH'$ and $HH''$ on $CM, FM, EM$ and $DM$ respectively (and let their lengths be $b',b'',c'$ and $c''$ resp.). We have shown the construction below:

We now chase similar triangles. The right triangles $\triangle MG'G$ and $\triangle MH''H$ have equal (vertically opposite) angles at $M$, so $\triangle MG'G \sim \triangle MH''H$ and $\frac{b}{c} = \frac{b'}{c''}$. Similarly $\triangle MG''G \sim \triangle MH'H$ gives $\frac{b}{c} = \frac{b''}{c'}$. Also, $\angle FCD = \angle FED$ and $\angle CFE = \angle CDE$ (angles in the same segment), so $\triangle CG'G \sim \triangle EH'H$ and $\triangle FG''G \sim \triangle DH''H$. Hence $\frac{b'}{c'} = \frac{CG}{EH}$ and $\frac{b''}{c''} = \frac{FG}{DH}$. Multiplying, we have $$\frac{b^2}{c^2}=\frac{b'b''}{c'c''}=\frac{CG \times FG}{DH \times EH}$$ The intersecting chords theorem gives us $CG \times FG = AG \times BG=(a-b)(a+b)$ and similarly $DH \times EH = (a-c)(a+c)$. Thus we have $\frac{b^2}{c^2}= \frac{a^2-b^2}{a^2-c^2}$. Cross-multiplying gives $a^2b^2 = a^2c^2$, and so $b=c$.
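The statement can also be tested numerically. The following sketch places the chord $AB$ on the horizontal line $y=h$ inside the unit circle, so that $M=(0,h)$, and checks that the two intersection points are mirror images about $M$. The value of $h$, the two chord directions and the helper names are assumptions made for illustration.

```python
import math

def chord_through_m(h, t):
    """Endpoints of the chord of the unit circle through M = (0, h) with direction angle t."""
    dx, dy = math.cos(t), math.sin(t)
    # Points M + s*(dx, dy) on the circle satisfy s^2 + 2*s*h*dy + h^2 - 1 = 0.
    root = math.sqrt((h * dy) ** 2 + 1 - h * h)
    s_plus, s_minus = -h * dy + root, -h * dy - root
    return (s_plus * dx, h + s_plus * dy), (s_minus * dx, h + s_minus * dy)

def x_on_ab(p, q, h):
    """x-coordinate of the point where line pq crosses the line y = h."""
    (x1, y1), (x2, y2) = p, q
    return x1 + (h - y1) * (x2 - x1) / (y2 - y1)

h = 0.3
C, D = chord_through_m(h, math.radians(50))    # chord CD through M
E, F = chord_through_m(h, math.radians(110))   # chord EF through M

g = x_on_ab(C, F, h)   # G = CF meets AB
k = x_on_ab(E, D, h)   # H = ED meets AB
print(g, k)            # M has x-coordinate 0, so GM = HM means g = -k
```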

References:

A Sequel to the First Six Books of Euclid by John Casey
David Joyce's web version of Euclid's Elements

Thursday, 6 June 2024

A Twist in Classical Proof of Infinitude of Primes

Most of us are familiar with the fact that there are infinitely many prime numbers. The classic argument of Euclid is as follows:

Consider a (nonempty) finite list of primes $p_1, p_2, \ldots, p_n$. Since $N := p_1p_2 \cdots p_n +1 > 1$, there must be a prime $q$ dividing $N$, and it's easy to see that $q$ is not in the given list (dividing $N$ by any $p_i$ leaves remainder $1$).

Prime Factorization of integers plays an important role in the above argument and this ultimately rests on Euclid's Lemma: if $a,b$ are integers and $p$ is a prime dividing $ab$ then $p$ divides either $a$ or $b$.

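Here is a modified version of the classical proof. Choose any integer $n > 1$. Since $n$ and $n+1$ are coprime and both exceed $1$, the product $N_1 = n(n+1)$ has at least two distinct prime factors. Likewise $N_1$ and $N_1+1$ are coprime, so $N_2 = N_1(N_1+1)$ has at least three distinct prime factors. Continuing in this way, we obtain numbers with as many distinct prime factors as we please, so there are infinitely many primes.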
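To watch the number of distinct prime factors grow, here is a small sketch that repeatedly multiplies a number by its successor, starting from $n=2$, and factors each product by trial division. The function names are hypothetical, and trial division is only practical for the first few terms since the products grow very quickly.

```python
def distinct_prime_factors(m):
    """Set of distinct prime factors of an integer m >= 2, by trial division."""
    factors = set()
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    return factors

def products(n, steps):
    """Return the list n(n+1), then that number times its successor, and so on."""
    terms = []
    for _ in range(steps):
        n = n * (n + 1)
        terms.append(n)
    return terms

terms = products(2, 4)
counts = [len(distinct_prime_factors(t)) for t in terms]
for t in terms:
    print(t, sorted(distinct_prime_factors(t)))
```

The argument only promises that the count increases by at least one at each step; it may increase by more.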

Monday, 17 July 2023

Informal Introduction to Continuous functions

You may have heard people describe the notion of a continuous function in various ways:

the graph of such a function can be drawn without lifting the pen
there is no sudden jump in values
the change in output can be made as small as we please by making the change in input sufficiently small.

Although these are informal ways to talk about continuity of functions, they are a good way to visualize some well behaved functions. Mathematicians like Bolzano and Cauchy tried to give a rigorous definition of continuity (and came pretty close). Finally, Weierstrass succeeded in giving a satisfactory (and most commonly used) definition of a continuous function.

For simplicity, we shall assume that the domain of the real valued function is an interval $I$ (e.g. $(0,1)$, $\mathbb{R}^+$, $[-1,1]$). However, the definition is still valid for any nonempty subset of $\mathbb{R}$. Let $f:I \to \mathbb{R}$ be a real valued function and let $a \in I$. Then $f$ is said to be continuous at $a$ if for any open interval $V$ around $f(a)$, there exists an open interval $U$ around $a$ such that the image of $U \cap I$ under $f$ is contained in $V$. In logical notation:

$$ \forall \epsilon > 0 \; \exists \delta > 0 \; \forall x \in I \;( |x - a| < \delta \implies |f(x) - f(a)| < \epsilon)$$

A function $f:I \to \mathbb{R}$ is said to be continuous if it is continuous at every $a \in I$. Since we are checking continuity of $f$ at every point, this is sometimes referred to as pointwise continuity. In logical notation,

$$\forall y \in I \; \forall \epsilon > 0 \; \exists \delta > 0 \; \forall x \in I\; ( |x - y| < \delta \implies |f(x) - f(y)| < \epsilon)$$

Observe that $\delta$ may depend on both $\epsilon$ and the point $a$ (where we are checking continuity). In the case where $\delta$ can be chosen independently of the point $a$, we say that $f$ is uniformly continuous (the adjective continuous on its own is reserved for pointwise continuity).
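To see how $\delta$ can depend on the point, consider $f(x)=x^2$ on $\mathbb{R}$. One valid choice (an assumption of this sketch, not the only possible one) is $\delta = \min\left(1, \frac{\epsilon}{2|a|+1}\right)$, since $|x^2-a^2| = |x-a|\,|x+a| < \delta(2|a|+1)$ whenever $|x-a|<\delta \leq 1$. The code samples points near $a$ to check the inequality and shows $\delta$ shrinking as $a$ grows.

```python
def delta_for_square(a, eps):
    """A delta that works for f(x) = x^2 at the point a (one choice among many)."""
    return min(1.0, eps / (2 * abs(a) + 1))

def check(a, eps, samples=1000):
    """Sample points x with |x - a| < delta and test |x^2 - a^2| < eps."""
    d = delta_for_square(a, eps)
    for k in range(-samples + 1, samples):
        x = a + d * k / samples
        if not abs(x * x - a * a) < eps:
            return False
    return True

eps = 0.1
points = [0.0, 1.0, 10.0, 100.0]
deltas = [delta_for_square(a, eps) for a in points]
ok = all(check(a, eps) for a in points)
print(deltas, ok)
```

For a fixed $\epsilon$, this $\delta$ tends to $0$ as $|a|$ grows, which reflects the fact that $x \mapsto x^2$ is continuous but not uniformly continuous on $\mathbb{R}$.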

Let $f:I \to \mathbb{R}$ be a continuous function and let $a<b$ where $a,b \in I$. Suppose $c$ is a real number lying between $f(a)$ and $f(b)$ (say $f(a) < c <f(b)$). Now we collect all numbers $x \in [a,b]$ such that $f(x) < c$. Obviously $a$ belongs to this collection and $b$ does not. This collection is nonempty and bounded above by $b$, so it has a least upper bound (this comes from a fundamental property of $\mathbb{R}$ called the supremum property), which we call $\alpha$. Now, from continuity of $f$, if it takes a positive (or negative) value at a point, it takes positive (or negative) values in a sufficiently small interval around that point (it is quite easy to prove this fact). We apply this to $f(x) - c$ to see that each of $f(\alpha) > c$ and $f(\alpha) < c$ contradicts the definition of $\alpha$. So, we have shown that there exists $\alpha \in (a,b)$ such that $f(\alpha) = c$.

This result is called the intermediate value theorem and the fact that $I$ is an interval played a role in the proof of this statement. This justifies the informal ideas of continuity.
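The intermediate value theorem is also what makes the bisection method work. Note that bisection is a different technique from the supremum argument above: it is a numerical sketch that repeatedly halves an interval $[lo, hi]$ while keeping $f(lo) < c \leq f(hi)$. The function name, the tolerance and the sample function $x^3+x$ are assumptions chosen for illustration.

```python
def bisect_value(f, a, b, c, tol=1e-12):
    """Approximate x in [a, b] with f(x) = c, assuming f is continuous and f(a) < c < f(b)."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < c:
            lo = mid      # mid still belongs to the collection {x : f(x) < c}
        else:
            hi = mid      # f(mid) >= c, so the crossing is at or before mid
    return (lo + hi) / 2

alpha = bisect_value(lambda x: x ** 3 + x, 0.0, 2.0, 5.0)
print(alpha, alpha ** 3 + alpha)
```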

References:

Calculus by M. Spivak

Wednesday, 12 July 2023

Naïve Set Theory and Paradoxes

A naive approach to set theory creates a lot of foundational problems, famous examples being paradoxes of Russell, Cantor and Burali-Forti. Discovery of these paradoxes implied that Cantor's original formulation of set theory was inconsistent. This article will introduce the aforementioned paradoxes.

Russell's paradox is closely related to the notion of a universal set. It can be stated as follows: Let $R$ be the set of all sets that are not members of themselves. Is $R$ a member of itself? It doesn't matter whether we assume $R \in R$ or $R \not \in R$; we reach a contradiction either way. In logical notation:

Let $R = \{r| r \not \in r \}$. Then $R \in R \iff R \not \in R$

To understand Cantor's paradox we need to understand the notion of a cardinal. We say that two sets are equinumerous if there exists a bijection between them. It is easily shown that equinumerosity is an equivalence relation. So we can pick out a representative from each equivalence class and we shall call them cardinals. Let $|X|$ denote the cardinal associated with $X$. Let $\chi$ denote the set of all cardinals. We will show that there exists a cardinal not belonging to $\chi$, which is a contradiction.

For that purpose, we give a partial order on $\chi$ as follows: $\mathcal{A} \leq \mathcal{B}$ iff there exists an injection from $\mathcal{A}$ to $\mathcal{B}$. In particular, $\mathcal{A} < \mathcal{B}$ means that there is an injection $f:\mathcal{A} \to \mathcal{B}$ but there is no bijection between them. It is fairly easy to check that $\leq$ is reflexive and transitive. Antisymmetry follows from the Schröder–Bernstein theorem. There is a trivial injection from a set $X$ to its power set $\mathcal{P}(X)$, namely $x \mapsto \{ x\}$. However, Cantor proved that there is no bijection between a set and its power set. So $|X| < |\mathcal{P}(X)|$. (Note if $A \subseteq B$ then $|A| \leq |B|$.)
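Cantor's argument that no map $f: X \to \mathcal{P}(X)$ is onto uses the diagonal set $D = \{x \in X \mid x \not \in f(x)\}$, which is never a value of $f$. For a small finite set this can be checked by brute force over every such map. The sketch below does so for a three-element set; the helper names are hypothetical.

```python
from itertools import product

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(x for x, keep in zip(xs, mask) if keep)
            for mask in product([False, True], repeat=len(xs))]

def diagonal_set(f, xs):
    """Cantor's diagonal set D = {x : x not in f(x)}."""
    return frozenset(x for x in xs if x not in f[x])

X = [0, 1, 2]
subsets = powerset(X)

missed_every_time = True
for images in product(subsets, repeat=len(X)):   # every function f : X -> P(X)
    f = dict(zip(X, images))
    if diagonal_set(f, X) in f.values():
        missed_every_time = False

print(len(subsets), missed_every_time)
```

Of course the finite case is obvious by counting ($3 < 8$); the point of the diagonal set is that the same construction works for infinite sets.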

Let $\mathcal{C}$ be the cardinal associated with the union of all cardinals, i.e. $\mathcal{C} = |\bigcup \chi|$. Noting that every cardinal is a subset of $\bigcup \chi$ and that inclusion maps are injective, we have $\mathcal{A} \leq \mathcal{C}$ for all $ \mathcal{A} \in \chi$. However, $\mathcal{C} < |\mathcal{P}(\mathcal{C})|$, so $|\mathcal{P}(\mathcal{C})|$ is strictly larger than every element of $\chi$. Hence $|\mathcal{P}(\mathcal{C})| \not \in \chi$, which is absurd.

The Burali-Forti paradox deals with ordinals. We are interested in partially ordered sets $(W, \leq)$ where every nonempty subset of $W$ has a least element. These are called well ordered sets (or wosets). Given two partial orders $(W_1, \leq_1)$ and $(W_2, \leq_2)$, a function $f:W_1 \to W_2$ is called an order-isomorphism if $f$ is a bijection and preserves order in both directions, i.e. $a \leq_1 b$ iff $f(a) \leq_2 f(b)$. In case such a function exists, we say $(W_1, \leq_1)$ is isomorphic to $(W_2, \leq_2)$. It is easy to check that this is an equivalence relation. Also note the property of "well order" is preserved by isomorphism. So from each equivalence class whose members are wosets, we can pick a representative which we call an ordinal. Let $\Omega$ denote the set of all ordinals.

A classical result states that given two wosets $(W_1, \leq_1)$ and $(W_2, \leq_2)$, exactly one of the following holds: (this is known as trichotomy of ordinals)

$(W_1, \leq_1)$ is isomorphic to $(W_2, \leq_2)$
$(W_1, \leq_1)$ is isomorphic to a proper initial segment of $(W_2, \leq_2)$
$(W_2, \leq_2)$ is isomorphic to a proper initial segment of $(W_1, \leq_1)$

So, we can define a partial order $\preceq$ on $\Omega$ in a natural way. With some work, we can show that $(\Omega, \preceq)$ is a woset. So there is an ordinal $\omega$ associated with $(\Omega, \preceq)$. Cantor had proved that any ordinal $\beta$ is the ordinal associated with the set $\{\alpha \in \Omega | \alpha \prec \beta \}$ well ordered using $\preceq$. So, in particular, the proper initial segment $\{\alpha \in \Omega | \alpha \prec \omega \}$ of $\Omega$ is order isomorphic to $\omega$ and hence isomorphic to $\Omega$, which contradicts the trichotomy theorem.

Cantor's theory of sets consisted of the laws of first order logic, the axiom of extensionality (which states that two sets are equal iff they have the same elements) and the axiom schema of comprehension (which states that given any logical property, we can construct a set containing precisely those elements satisfying the given property). The comprehension schema turns out to be the source of all these paradoxes. Standard theories like ZF set theory prevent these paradoxes by weakening the comprehension axiom, so that we cannot construct the universal set, the set $R$ described in Russell's paradox, the set of all cardinals $\chi$ or the set of all ordinals $\Omega$.

PS: This is an unreleased draft of an article which is probably gonna appear in CMIT's Donut. However, I have to cut down some words in final version.

References:

Foundations of Mathematics by Kenneth Kunen
Naive Set Theory by Paul Halmos

Thursday, 29 June 2023

The Nine Point Circle

The nine point circle (also known as Feuerbach's circle) is one of the most interesting topics in elementary geometry. It doesn't require many prerequisites, so this post can be read by a high school student who has some idea about the geometry of circles and triangles.

Just a reminder: All three altitudes of a triangle meet at a single point, which we call the orthocenter, and a set of points is said to be concyclic if they all lie on a circle.

Consider $\triangle ABC$ and let $O$ be the orthocenter (shown in green). Let $A_a$ be the foot of the altitude from vertex $A$ on side $BC$, $A_m$ be the midpoint of side $BC$ and $A_o$ be the midpoint of line segment $AO$ (we have used the colors red, blue and orange respectively for these points). We use similar notation for vertices $B$ and $C$. Then the nine points (which sometimes may not be distinct) $A_o, B_o, C_o$ (midpoints between orthocenter and vertices), $A_m, B_m, C_m$ (midpoints of sides) and $A_a, B_a, C_a$ (feet of altitudes) lie on the same circle, which is commonly called the nine point circle (shown with a dotted circle).

Try interacting with the GeoGebra applet given below ($\Omega$ is the center of nine point circle):

To prove this we need some elementary results:

A convex quadrilateral is cyclic (i.e. its vertices are concyclic) if and only if its opposite angles are supplementary (i.e. they sum to $180^{\circ}$). One half of this result is Euclid's proposition $22$ in Book III.
The line segment formed by connecting the midpoints of two sides of a triangle is parallel to the third side and has half of its length. This is called the midpoint theorem.
The center of the circumcircle of a right triangle lies on its hypotenuse (it is the midpoint of the hypotenuse, so the hypotenuse is a diameter). This is the converse of a famous result called Thales' theorem.

Let us start by constructing line segments $C_mB_m$, $B_oC_o$, $C_mB_o$ and $C_oB_m$.

Now in $\triangle OBC$, we use the midpoint theorem to conclude $B_oC_o$ is parallel to $BC$. Similarly in $\triangle ABC$ we get $C_mB_m$ is parallel to $BC$. Similarly, by considering $\triangle AOB$ and $\triangle AOC$, we get $C_mB_o$ and $C_oB_m$ are parallel (since both are parallel to $AO$ by the midpoint theorem). Since line $AO$ (i.e. line $AA_a$) is perpendicular to $BC$, we get $C_oB_m$ is perpendicular to $C_mB_m$. So, $\square C_mB_mC_oB_o$ is a rectangle and in particular is a cyclic quadrilateral. Moreover, by the converse of Thales' theorem, $C_mC_o$ and $B_mB_o$ are diameters of the circle passing through the vertices of $\square C_mB_mC_oB_o$. Since $\angle C_mC_aC_o$ and $\angle B_mB_aB_o$ are right angles, $C_a$ and $B_a$ also lie on this circle.

We have proved that $C_m,B_m,C_o,B_o,C_a, B_a$ lie on a circle with diameters $C_mC_o$ and $B_mB_o$. Similarly, we can show $A_m,B_m,A_o,B_o,A_a, B_a$ lie on a circle with diameters $A_mA_o$ and $B_mB_o$, and $A_m,C_m,A_o,C_o,A_a, C_a$ lie on a circle with diameters $A_mA_o$ and $C_mC_o$. Looking at the diameters of these three circles, we see that any two of them have a common diameter. So these three circles must coincide, and this common circle is indeed the required nine point circle.
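For readers who like to compute, here is an exact check on one triangle using rational arithmetic. The triangle coordinates are an arbitrary choice, the helper names are hypothetical, and the center is taken to be the midpoint of $C_mC_o$, which the argument above shows is a diameter.

```python
from fractions import Fraction as F

def foot(p, x, y):
    """Foot of the perpendicular from p onto line xy."""
    dx, dy = y[0] - x[0], y[1] - x[1]
    t = ((p[0] - x[0]) * dx + (p[1] - x[1]) * dy) / (dx * dx + dy * dy)
    return (x[0] + t * dx, x[1] + t * dy)

def mid(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def dist2(p, q):
    """Squared distance, which stays rational."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

A, B, C = (F(0), F(0)), (F(6), F(0)), (F(2), F(4))

# Feet of the altitudes
A_a, B_a, C_a = foot(A, B, C), foot(B, A, C), foot(C, A, B)

# Orthocenter O = A + s(A_a - A), with s chosen so that BO is perpendicular to AC
ac = (C[0] - A[0], C[1] - A[1])
s = ((B[0] - A[0]) * ac[0] + (B[1] - A[1]) * ac[1]) / \
    ((A_a[0] - A[0]) * ac[0] + (A_a[1] - A[1]) * ac[1])
O = (A[0] + s * (A_a[0] - A[0]), A[1] + s * (A_a[1] - A[1]))

A_m, B_m, C_m = mid(B, C), mid(A, C), mid(A, B)   # midpoints of sides
A_o, B_o, C_o = mid(A, O), mid(B, O), mid(C, O)   # midpoints to the orthocenter

center = mid(C_m, C_o)
nine = [A_a, B_a, C_a, A_m, B_m, C_m, A_o, B_o, C_o]
radii2 = {dist2(center, p) for p in nine}
print(center, radii2)   # a single squared radius means all nine points are concyclic
```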

References:

A Sequel to the First Six Books of Euclid by John Casey
David Joyce's web version of Euclid's Elements

Sunday, 25 June 2023

Riemann Sphere and Stereographic Projection

While studying complex valued functions, it is sometimes useful to consider the extended complex plane $\mathbb{C} \cup \{ \infty \}$, where we append an additional point $\infty$ to the set of usual complex numbers. This is especially helpful in the study of Möbius maps $z \mapsto \frac{az+b}{cz+d}$ where $ad-bc \not = 0$.

The Riemann sphere provides us with a model of the extended complex plane. This makes it useful in complex analysis because it allows us to make sense of division by zero in some circumstances ($\frac{0}{0}$ is still undefined).

We start with the unit sphere $S^2$ in $\mathbb{R}^3$, i.e. the spherical surface with center at the origin and unit radius (the set of points $(x,y,z)$ given by $x^2 +y^2 +z^2=1$). First, we shall show there is a bijection between $\mathbb{R}^2$ and $S^2$ minus a single point. (We identify the $xy$-plane with $\mathbb{R}^2$.)

Start with a point $P$ on $S^2$ other than the north pole $N = (0,0,1)$. Draw a ray starting from $N$ passing through $P$. This ray will intersect the $xy$-plane at a unique point $Q$. The map that takes $P$ to $Q$ is called stereographic projection and is denoted by $\Pi$.

We shall use some properties of vectors to derive stereographic projection explicitly. $Q - N$ is parallel to $P - N$, so $P - N = k(Q - N)$ for some scalar $k$. Denote $P=(x,y,z)$ and $Q=(u,v,0)$. Then, we get $(x,y,z) = (ku,kv,1-k)$ so that $k=1-z$. Note $z \not = 1$ since $P \not = N$.

$$\therefore \Pi (x,y,z) = (u,v,0) = \left( \frac{x}{1-z}, \frac{y}{1-z}, 0 \right)$$

Since $x^2 + y^2 + z^2 = 1$, we have $k^2u^2 + k^2v^2 + (1-k)^2 = 1$. Simplifying (while noting $k \not = 0$ since $z \not = 1$),

$$k = \frac{2}{u^2+v^2 +1}$$

$$\therefore P = (x,y,z) = \left( \frac{2u}{u^2+v^2 +1}, \frac{2v}{u^2+v^2 +1}, \frac{u^2 +v^2 -1}{u^2+v^2 +1} \right)$$

It takes some simple calculations to verify that $\Pi : S^2 \setminus \{ N \} \to \mathbb{R}^2$ is a bijection and the map $ (u,v,0) \mapsto \left( \frac{2u}{u^2+v^2 +1}, \frac{2v}{u^2+v^2 +1}, \frac{u^2 +v^2 -1}{u^2+v^2 +1} \right)$ is its inverse.
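The two formulas are easy to try out numerically. Below is a minimal sketch implementing $\Pi$ and its inverse as derived above, with a round trip on one sample point; the function names `project` and `unproject` are hypothetical.

```python
import math

def project(p):
    """Stereographic projection of a point (x, y, z) on the unit sphere, z != 1, to (u, v)."""
    x, y, z = p
    return (x / (1 - z), y / (1 - z))

def unproject(q):
    """Inverse map: send (u, v) in the plane back to the unit sphere."""
    u, v = q
    d = u * u + v * v + 1
    return (2 * u / d, 2 * v / d, (u * u + v * v - 1) / d)

P = (0.6, 0.0, 0.8)          # lies on the sphere since 0.36 + 0.64 = 1
Q = project(P)               # approximately (3, 0)
back = unproject(Q)          # should recover P up to rounding

south = project((0.0, 0.0, -1.0))   # the south pole goes to the origin
print(Q, back, south)
```

Points near the north pole have $1-z$ close to $0$ and are sent far from the origin, which matches the idea that $N$ itself corresponds to $\infty$.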

Interact with GeoGebra applet to see how stereographic projection actually works.

In fact our map $\Pi$ is conformal (ie. it preserves angles) but proof of this assertion requires some tools from differential geometry.

We can extend $\Pi$ to the whole sphere $S^2$ by simply defining $\Pi (N) = \infty$. Here, $\infty$ is just a symbol. We similarly extend $\Pi^{-1} : \mathbb{R}^2 \cup \{ \infty \} \to S^2$. Since we can interpret $ \mathbb{C}$ as $\mathbb{R}^2$ (from a topological point of view), we can also define $\Pi^{-1} : \mathbb{C} \cup \{ \infty \} \to S^2$ as $$z \mapsto \left( \frac{2 \Re(z)}{|z|^2 +1}, \frac{2 \Im(z)}{|z|^2 +1}, \frac{|z|^2 -1}{|z|^2 +1} \right)$$

where $\Re(z)$ and $\Im(z)$ are the real and imaginary parts of $z$ respectively, and

$$\infty \mapsto (0,0,1)$$

PS: From a topological point of view, we can define a topology $\tau_1$ on $\mathbb{C} \cup \{\infty\}$ such that $\mathbb{C}$ with its usual topology $\tau_2$ is a subspace of $(\mathbb{C} \cup \{\infty\}, \tau_1)$, and $(\mathbb{C} \cup \{\infty\}, \tau_1)$ is the one-point compactification of $(\mathbb{C}, \tau_2)$.

References:

A Pathway to Complex Analysis by S. Kumaresan
Topology by J. Munkres
A Comprehensive introduction to Differential Geometry Vol. 2 by M. Spivak

Wednesday, 21 June 2023

What is a matrix?

The following post is mostly aimed at high school students interested in mathematics (and even budding undergrads). We have learned in our schools that a matrix is a rectangular array of numbers. But is that all?

In physics we have seen vector quantities like force, displacement etc. A vector like $2i+4j+7k$ can also be represented by the ordered triple $(2,4,7)$. We can generalize to ordered $n$-tuples and talk about vectors in $n$-dimensional space. We consider the set $\mathbb{R}^n$ of all ordered $n$-tuples of real numbers (each individual $n$-tuple will be referred to as a vector). Let $X = \left( x_1, x_2,...,x_n \right)$, $Y = \left( y_1, y_2,...,y_n \right)$ and let $k$ be a real number (also called a scalar). Then $X+Y$ is defined as the vector $\left( x_1 + y_1, x_2 + y_2,...,x_n + y_n \right)$ and $k \cdot X = \left( kx_1, kx_2,...,kx_n \right)$. The operations $+$ and $ \cdot $ are called vector addition and scalar multiplication respectively. We usually write $k \cdot X$ as $kX$. For $n=2$ and $3$, this coincides with the usual notion of vectors in classical physics, and for $n=1$ it is the usual addition and multiplication of real numbers.

Some properties which we can easily verify are:

$X+Y = Y+X$
$X+(Y+Z) = (X+Y)+Z$
There exists $\textbf{0} \in \mathbb{R}^n$ such that for all $X \in \mathbb{R}^n$, $X +\textbf{0} = X$. Just choose $\textbf{0} = \left( 0, 0,...,0 \right)$.
For all $X \in \mathbb{R}^n$ there exists $X' \in \mathbb{R}^n$ such that $X+X'=\textbf{0}$. Take $X' = (-1)X$; it is not hard to see that this is the only possible choice for $X'$.
$(k_1k_2) X = k_1 (k_2 X)$
$(k_1 + k_2) X = k_1 X + k_2 X$
$k (X +Y) = k X + k Y$
$0X = \textbf{0}$ and $1X = X$

It is easy to see that every vector $X$ in $\mathbb{R}^n$ can be represented uniquely in the form $\sum_{i=1}^{n} x_i E_i$ where $X = \left( x_1, x_2,...,x_n \right)$ and $E_i$ is the $n$-tuple whose $i^{th}$ entry is $1$ and whose other entries are $0$. These $E_i$'s are said to form a basis for $\mathbb{R}^n$. In case of confusion we shall denote them as $E_i^n$ to indicate the dimension of the space.

We say that a function $T: \mathbb{R}^n \to \mathbb{R}^m$ is linear if for all $X, Y \in \mathbb{R}^n$ and $k \in \mathbb{R}$, $$T(kX +Y) = kT(X) + T(Y).$$

Can you think of some examples of linear maps? Since $T(X) = \sum_{i=1}^{n} x_i T(E_i)$, the values at the basis elements $E_i, i = 1,2,...,n$ determine the linear map. Moreover, if you wish to define a linear map $T$ satisfying $T(E_i) = F_i, i=1,2,...,n$, define $T(X) = \sum_{i=1}^{n} x_i F_i$. It is not hard to verify that this map is well defined and linear. (Try this out! Read the paragraph on basis again.)

If $T: \mathbb{R}^n \to \mathbb{R}^m$ and $S: \mathbb{R}^m \to \mathbb{R}^l$ are maps, we can define the composition $S \circ T: \mathbb{R}^n \to \mathbb{R}^l$ as the map $(S \circ T)(X) = S(T(X))$. It is an easy exercise to verify that if $T$ and $S$ are linear, then $S \circ T$ is linear. (We usually write $ST$ for $S \circ T$.) If $T_1: \mathbb{R}^n \to \mathbb{R}^m$ and $T_2: \mathbb{R}^n \to \mathbb{R}^m$ are maps, then we can define $T_1 + T_2:\mathbb{R}^n \to \mathbb{R}^m$ as $(T_1 + T_2)(X) = T_1(X) + T_2(X)$. If $T_1$ and $T_2$ are linear, so is $T_1 + T_2$.

In the previous paragraphs, we saw how the values of a linear map $T$ at the basis elements completely determine the map, and how to construct a linear map with given values at the basis elements (this construction is unique by the first statement). We can interpret elements of $\mathbb{R}^n$ as column vectors, i.e. $n \times 1$ matrices. Let $T: \mathbb{R}^n \to \mathbb{R}^m$ be linear. Now, form an $m \times n$ matrix with its $i^{th}$ column being $T(E_i^n)$ (since $T(E_i^n) \in \mathbb{R}^m$, it is interpreted as an $m \times 1$ matrix). We denote this matrix by $\left[T \right]$. The procedure for constructing a linear map with given values at the basis elements tells us that every $m \times n$ matrix is of the form $\left[T \right]$ for some linear map $T: \mathbb{R}^n \to \mathbb{R}^m$. So there is a bijective (one to one and onto) correspondence between the set of all linear functions from $\mathbb{R}^n$ to $\mathbb{R}^m$ (denoted by $\mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^m \right)$) and the set of all $m \times n$ matrices (denoted by $\mathcal{M}_{m \times n} (\mathbb{R})$).

Let $A$ be an $m \times n$ matrix. We denote the $(i,j)^{th}$ entry (i.e. the number in the $i^{th}$ row and $j^{th}$ column) of $A$ by $A_{ij}$. Recall that if $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix, then we can define the matrix product $AB$ by $(AB)_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj}$. Usual properties like commutativity need not hold, i.e. it is not always true that $AB = BA$. However, associativity still holds, i.e. if $A$ is an $m \times n$, $B$ is an $n \times p$ and $C$ is a $p \times q$ matrix, then $A(BC) = (AB)C$. (It requires a little work.) The distributive property is also valid, i.e. $A(B+C) = AB +AC$ and $(A+B)C = AC + BC$, whenever they are defined. (In addition of matrices, we add entrywise, i.e. $(A+B)_{ij} = A_{ij} + B_{ij}$.)

Let us look at the matrix $\left[T \right]$ in more detail. Denote $\left[T \right]_{ij}$ by $t_{ij}$. From our above discussion, we see that $T(E_j^n) = \sum_{i=1}^{m} t_{ij}E_i^m$. Let $X \in \mathbb{R}^n$ be as given previously. We shall compute $\left[T \right]X$ (the product will give an $m \times 1$ matrix). $$(\left[T \right]X)_{i1} = \sum_{k=1}^{n} t_{ik}X_{k1} = \sum_{k=1}^{n} t_{ik}x_k$$

Now, $T(X) = \sum_{k=1}^{n} x_k T(E_k^n) = \sum_{k=1}^{n} x_k \left( \sum_{i=1}^{m} t_{ik}E_i^m \right) = \sum_{i=1}^{m} \left( \sum_{k=1}^{n} t_{ik}x_k \right) E_i^m$

$$\therefore T(X) = \sum_{i=1}^{m}( \left[T \right]X)_{i1}E_i^m$$

Thus $\left[T \right]X$ gives $T(X)$ in column vector form.

It is easy to verify that $\left[ T_1 + T_2 \right] = \left[T_1 \right] + \left[T_2 \right]$. But it takes slightly more work to show that $ \left[ST \right] = \left[S \right] \left[T \right]$ (take it as a challenging exercise; it is similar to the calculation we have done above). Note that the identity function $id_n:\mathbb{R}^n \to \mathbb{R}^n$ belongs to $\mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^n \right)$ and $\left[id_n \right] $ is the $n \times n$ identity matrix $I_n$. The zero map $\mathbb{0}:\mathbb{R}^n \to \mathbb{R}^m$ given by $\mathbb{0}(X) = \textbf{0}$ is linear and, unsurprisingly, $\left[ \mathbb{0} \right]$ is the $m \times n$ matrix with all entries $0$.
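The construction of $\left[T \right]$ and the identity $\left[ST \right] = \left[S \right] \left[T \right]$ can be tried out on a small example. In this sketch the two linear maps `T` and `S` are made-up examples, the helper names are hypothetical, vectors are plain Python lists, and indices start at $0$ rather than $1$.

```python
def basis(n, i):
    """The basis vector E_i of R^n (0-indexed)."""
    return [1 if k == i else 0 for k in range(n)]

def matrix_of(T, n):
    """Matrix [T] of a linear map T on R^n, as a list of rows: column i is T(E_i)."""
    cols = [T(basis(n, i)) for i in range(n)]
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]

def matmul(A, B):
    """Matrix product (AB)_ij = sum_k A_ik B_kj."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def T(X):                 # an example linear map R^3 -> R^2
    x, y, z = X
    return [x + 2 * y, 3 * z - y]

def S(Y):                 # an example linear map R^2 -> R^3
    u, v = Y
    return [u - v, 2 * u, v]

mat_T = matrix_of(T, 3)                         # a 2 x 3 matrix
mat_S = matrix_of(S, 2)                         # a 3 x 2 matrix
mat_ST = matrix_of(lambda X: S(T(X)), 3)        # matrix of the composition ST

print(mat_T)
print(mat_S)
print(mat_ST == matmul(mat_S, mat_T))
```

Computing `matmul(mat_T, [[x], [y], [z]])` for a column vector likewise reproduces `T([x, y, z])`, which is the statement that $\left[T \right]X$ gives $T(X)$ in column vector form.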

We have seen there is a deep connection between linear maps and matrices. The correspondence $T \mapsto \left[T \right]$ is a bijection $\mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^m \right) \to \mathcal{M}_{m \times n} (\mathbb{R})$ that preserves structure (the technical term is isomorphism). In conclusion, matrices are precisely linear maps.

PS: We worked with set of real numbers $\mathbb{R}$. We could also work with $\mathbb{Q}$ or $\mathbb{C}$ (ie. set of rationals and complex numbers respectively). Instead of $\mathbb{R}^n$ or $\mathbb{C}^n$, the same reasoning applies when we work with algebraic structures called finite dimensional vector spaces. Pick up a book on linear algebra to learn more. Hope this was a good motivation to study matrices.

References:

Linear Algebra $4^{th}$ ed. by Friedberg, Insel and Spence
Linear Algebra Done Right $3^{rd}$ ed. by Axler
What Is Mathematics? by Courant and Robbins