Thursday, 29 June 2023

The Nine Point Circle

The nine point circle (also known as Feuerbach's circle) is one of the most interesting topics in elementary geometry. It doesn't require many prerequisites, so this post can be read by a high school student who has some idea about the geometry of circles and triangles.

Just a reminder: all three altitudes of a triangle meet at a single point, which we call the orthocenter. A set of points is said to be concyclic if they all lie on a circle.






Consider $\triangle ABC$ and let $O$ be the orthocenter (shown in green). Let $A_a$ be the foot of the altitude from vertex $A$ on side $BC$, $A_m$ be the midpoint of side $BC$ and $A_o$ be the midpoint of line segment $AO$ (we have used the colors red, blue and orange respectively for these points). We use similar notation for vertices $B$ and $C$. Then the nine points (which may sometimes not be distinct) $A_o, B_o, C_o$ (midpoints between the orthocenter and the vertices), $A_m, B_m, C_m$ (midpoints of sides) and $A_a, B_a, C_a$ (feet of altitudes) lie on the same circle, which is commonly called the nine point circle (shown as a dotted circle).

Try interacting with the GeoGebra applet given below ($\Omega$ is the center of nine point circle):


To prove this we need some elementary results:

  • A convex quadrilateral is cyclic (ie. its vertices are concyclic) if and only if its opposite angles are supplementary (ie. they add up to $180^{\circ}$). One half of this result is Euclid's proposition $22$ in Book III.
  • The line segment connecting the midpoints of two sides of a triangle is parallel to the third side and has half its length. This is called the midpoint theorem.
  • The center of the circumcircle of a right triangle is the midpoint of its hypotenuse, so the hypotenuse is a diameter of the circumcircle. This is the converse of a famous result called Thales' theorem.
Let us start by constructing line segments $C_mB_m$, $B_oC_o$, $C_mB_o$ and $C_oB_m$.




Now in $\triangle OBC$, we use the midpoint theorem to conclude that $B_oC_o$ is parallel to $BC$. Similarly, in $\triangle ABC$ we get that $C_mB_m$ is parallel to $BC$. By considering $\triangle AOB$ and $\triangle AOC$, we get that $C_mB_o$ and $C_oB_m$ are parallel (since both are parallel to $AO$ by the midpoint theorem). Since line $AO$ (ie. line $AA_a$) is perpendicular to $BC$, we get that $C_oB_m$ is perpendicular to $C_mB_m$. So $\square C_mB_mC_oB_o$ is a rectangle and in particular a cyclic quadrilateral. Moreover, by the converse of Thales' theorem, $C_mC_o$ and $B_mB_o$ are diameters of the circle passing through the vertices of $\square C_mB_mC_oB_o$. Since $\angle C_mC_aC_o$ and $\angle B_mB_aB_o$ are right angles (the altitudes $CC_a$ and $BB_a$ are perpendicular to the sides $AB$ and $AC$), the converse of Thales' theorem again shows that $C_a$ and $B_a$ also lie on this circle.

We have proved that $C_m,B_m,C_o,B_o,C_a, B_a$ lie on a circle with diameters $C_mC_o$ and $B_mB_o$. Similarly, we can show that $A_m,B_m,A_o,B_o,A_a, B_a$ lie on a circle with diameters $A_mA_o$ and $B_mB_o$, and that $A_m,C_m,A_o,C_o,A_a, C_a$ lie on a circle with diameters $A_mA_o$ and $C_mC_o$. Looking at the diameters of these three circles, we see that any two of them have a common diameter. So these three circles must coincide, and the common circle is indeed the required nine point circle.
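The statement can also be checked numerically. The following is a minimal sketch in Python, not part of the proof: the sample triangle and the helper names (`midpoint`, `foot`, `orthocenter`) are choices made here for illustration. It computes the nine points and tests that they are all at the same distance from the midpoint of $C_mC_o$, which the argument above identifies as the center.

```python
from math import hypot, isclose

def midpoint(p, q):
    """Midpoint of the segment pq."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def foot(p, a, b):
    """Foot of the perpendicular dropped from p onto the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return (a[0] + t * dx, a[1] + t * dy)

def orthocenter(a, b, c):
    """Intersection H of the altitudes from a and b, obtained by solving
    (H - a).(c - b) = 0 and (H - b).(c - a) = 0 with Cramer's rule."""
    d1 = (c[0] - b[0], c[1] - b[1])
    d2 = (c[0] - a[0], c[1] - a[1])
    r1 = d1[0] * a[0] + d1[1] * a[1]
    r2 = d2[0] * b[0] + d2[1] * b[1]
    det = d1[0] * d2[1] - d1[1] * d2[0]  # nonzero for a non-degenerate triangle
    return ((r1 * d2[1] - d1[1] * r2) / det, (d1[0] * r2 - r1 * d2[0]) / det)

# Sample triangle (an arbitrary choice for this sketch).
A, B, C = (0.0, 0.0), (6.0, 0.0), (1.0, 4.0)
O = orthocenter(A, B, C)

feet = [foot(A, B, C), foot(B, C, A), foot(C, A, B)]           # A_a, B_a, C_a
side_mids = [midpoint(B, C), midpoint(C, A), midpoint(A, B)]   # A_m, B_m, C_m
ortho_mids = [midpoint(A, O), midpoint(B, O), midpoint(C, O)]  # A_o, B_o, C_o

# C_m C_o is a diameter, so its midpoint is the center of the circle.
C_m, C_o = side_mids[2], ortho_mids[2]
center = midpoint(C_m, C_o)
radius = hypot(C_m[0] - C_o[0], C_m[1] - C_o[1]) / 2

distances = [hypot(p[0] - center[0], p[1] - center[1])
             for p in feet + side_mids + ortho_mids]
all_on_circle = all(isclose(d, radius, abs_tol=1e-9) for d in distances)
print(all_on_circle)
```

Changing `A`, `B`, `C` to any other non-degenerate triangle should leave `all_on_circle` true, up to floating point tolerance.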

References:
  •  A Sequel to the First Six Books of Euclid by John Casey
  • David Joyce's web version of Euclid's Elements



Sunday, 25 June 2023

Riemann Sphere and Stereographic Projection

While studying complex valued functions, it is sometimes useful to consider the extended complex plane $\mathbb{C} \cup \{ \infty \}$, where we append an additional point $\infty$ to the set of usual complex numbers. This is especially helpful in the study of Möbius maps $z \mapsto \frac{az+b}{cz+d}$ where $ad-bc \not = 0$.

The Riemann sphere provides us with a model of the extended complex plane. This makes it useful in complex analysis because it allows us to make sense of division by zero in some circumstances ($\frac{0}{0}$ is still undefined).

We start with the unit sphere $S^2$ in $\mathbb{R}^3$ ie. the spherical surface with center at the origin and unit radius (the set of points $(x,y,z)$ given by $x^2 +y^2 +z^2=1$). First, we shall show that there is a bijection between $\mathbb{R}^2$ and $S^2$ minus a single point. (We identify the $xy$-plane with $\mathbb{R}^2$.)

Start with a point $P$ on $S^2$ other than the north pole $N = (0,0,1)$. Draw a ray starting from $N$ and passing through $P$. This ray will intersect the $xy$-plane at a unique point $Q$. The map that takes $P$ to $Q$ is called the stereographic projection and is denoted by $\Pi$.


We shall use some properties of vectors to derive stereographic projection explicitly. $Q - N$ is parallel to $P - N$, so $P - N = k(Q - N)$ for some scalar $k$. Denote $P=(x,y,z)$ and $Q=(u,v,0)$. Then, we get $(x,y,z) = (ku,kv,1-k)$ so that $k=1-z$. Note $z \not = 1$ since $P \not = N$.

$$\therefore \Pi (x,y,z) = (u,v,0) = \left( \frac{x}{1-z}, \frac{y}{1-z}, 0 \right)$$

Since $x^2 + y^2 + z^2 = 1$, we have $k^2u^2 + k^2v^2 + (1-k)^2 = 1$. Simplifying (while noting $k \not = 0$ since $z \not = 1$),

$$k = \frac{2}{u^2+v^2 +1}$$

$$\therefore P = (x,y,z) = \left(  \frac{2u}{u^2+v^2 +1}, \frac{2v}{u^2+v^2 +1}, \frac{u^2 +v^2 -1}{u^2+v^2 +1} \right)$$

It takes some simple calculations to verify that $\Pi : S^2 \setminus \{ N \} \to \mathbb{R}^2$ is a bijection and the map $ (u,v,0) \mapsto \left(  \frac{2u}{u^2+v^2 +1}, \frac{2v}{u^2+v^2 +1}, \frac{u^2 +v^2 -1}{u^2+v^2 +1} \right)$ is its inverse.
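The two formulas can be tried out numerically. Below is a small Python sketch (the function names `project` and `unproject` are made up for this illustration) that applies $\Pi$ to a point of the sphere and then applies the claimed inverse, recovering the original point up to rounding.

```python
from math import isclose

def project(x, y, z):
    """Pi: a point (x, y, z) of S^2 other than N is sent to (u, v) in the plane."""
    if z == 1.0:
        raise ValueError("the north pole has no image in the plane")
    return (x / (1 - z), y / (1 - z))

def unproject(u, v):
    """Claimed inverse: a point (u, v) of the plane is sent back to S^2."""
    s = u * u + v * v + 1
    return (2 * u / s, 2 * v / s, (u * u + v * v - 1) / s)

P = (2 / 3, 1 / 3, -2 / 3)   # lies on S^2 since 4/9 + 1/9 + 4/9 = 1
Q = project(*P)
P_back = unproject(*Q)

round_trip_ok = all(isclose(a, b, abs_tol=1e-12) for a, b in zip(P, P_back))
on_sphere = isclose(sum(c * c for c in unproject(3.0, -4.0)), 1.0)
print(Q, round_trip_ok, on_sphere)
```

The last check confirms that `unproject` really lands on the unit sphere for an arbitrary point of the plane.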


Interact with GeoGebra applet to see how stereographic projection actually works.

In fact, our map $\Pi$ is conformal (ie. it preserves angles), but the proof of this assertion requires some tools from differential geometry.

We can extend $\Pi$ to the whole sphere $S^2$ by simply defining $\Pi (N) = \infty$. Here, $\infty$ is just a symbol. We similarly extend $\Pi^{-1} :  \mathbb{R}^2 \cup \{ \infty \} \to S^2$. Since we can interpret $ \mathbb{C}$ as  $\mathbb{R}^2$ (from a topological point of view), we can also define $\Pi^{-1} :  \mathbb{C} \cup \{ \infty \} \to S^2$ as $$z \mapsto \left(  \frac{2 \Re(z)}{|z|^2 +1}, \frac{2 \Im(z)}{|z|^2 +1}, \frac{|z|^2 -1}{|z|^2 +1} \right)$$

where $\Re(z)$ and $\Im(z)$ are real and imaginary part of $z$ respectively, and

$$\infty \mapsto (0,0,1)$$
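To see why sending $\infty$ to the north pole is natural, here is a short Python sketch of $\Pi^{-1}$ on the extended complex plane. The sentinel `INFINITY` and the function name `to_sphere` are hypothetical names introduced only for this example. As $|z|$ grows, the third coordinate of the image climbs towards $1$, ie. the image approaches $N$.

```python
INFINITY = object()  # hypothetical sentinel standing for the symbol "infinity"

def to_sphere(z):
    """Pi^{-1}: a point of the extended complex plane is sent to S^2."""
    if z is INFINITY:
        return (0.0, 0.0, 1.0)
    z = complex(z)
    r2 = abs(z) ** 2
    return (2 * z.real / (r2 + 1), 2 * z.imag / (r2 + 1), (r2 - 1) / (r2 + 1))

south = to_sphere(0)           # the origin goes to the south pole
equator = to_sphere(1)         # points of the unit circle stay on the equator
north = to_sphere(INFINITY)    # the added point goes to the north pole

# Third coordinates of the images of 1, 10, 100, 1000: they increase towards 1.
heights = [to_sphere(r)[2] for r in (1, 10, 100, 1000)]
print(south, equator, north, heights)
```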

PS: From a topological point of view, we can define a topology $\tau_1$ on $\mathbb{C} \cup \{\infty\}$ such that the usual topology $\tau_2$ on $\mathbb{C}$ is the subspace topology induced by $\tau_1$, and $\mathbb{C} \cup \{\infty\}$ with $\tau_1$ is the one point compactification of $\mathbb{C}$ with $\tau_2$.

References:

  • A Pathway to Complex Analysis by S. Kumaresan
  • Topology by J. Munkres
  • A Comprehensive introduction to Differential Geometry Vol. 2 by M. Spivak










Wednesday, 21 June 2023

What is a matrix?

The following post is mostly aimed at high school students interested in mathematics (and even budding undergrads). We have learned in our schools that a matrix is a rectangular array of numbers. But is that all?

In physics we have seen vector quantities like force, displacement etc. A vector like $2i+4j+7k$ can also be represented by the ordered triple $(2,4,7)$. We can generalize to ordered $n$-tuples and talk about vectors in $n$-dimensional space. We consider the set $\mathbb{R}^n$ of all ordered $n$-tuples of real numbers (each individual $n$-tuple will be referred to as a vector). Let $X = \left( x_1, x_2,...,x_n \right)$, $Y = \left( y_1, y_2,...,y_n \right)$ and $k$ be a real number (also called a scalar). Then $X+Y$ is defined as the vector $\left( x_1 + y_1, x_2 + y_2,...,x_n + y_n \right)$ and $k \cdot X$ as $\left( kx_1, kx_2,...,kx_n \right)$. The operations $+$ and $ \cdot $ are called vector addition and scalar multiplication respectively. We usually write $k \cdot X$ as $kX$. For $n=2$ and $3$, this coincides with the usual notion of vectors in classical physics and for $n=1$, it is the usual addition and multiplication of real numbers.

Some properties which we can easily verify are:
  • $X+Y = Y+X$
  • $X+(Y+Z) = (X+Y)+Z$
  • There exists $\textbf{0} \in \mathbb{R}^n$ such that for all $X \in \mathbb{R}^n$, $X +\textbf{0} = X$. Just choose $\textbf{0} = \left( 0, 0,...,0 \right)$.
  • For all $X \in \mathbb{R}^n$ there exists $X' \in  \mathbb{R}^n$ such that $X+X'=\textbf{0}$. Take $X' = (-1)X$; it is not hard to see that this is the only possible choice for $X'$.
  • $(k_1k_2) X = k_1 (k_2  X)$
  • $(k_1 + k_2)  X = k_1  X + k_2 X$
  • $k (X +Y) = k  X + k  Y$
  • $0X = \textbf{0}$ and $1X = X$
It is easy to see that every vector $X$ in $\mathbb{R}^n$ can be represented uniquely in the form $\sum_{i=1}^{n} x_i E_i$ where $X = \left( x_1, x_2,...,x_n \right)$ and $E_i$ is the $n$-tuple whose $i^{th}$ entry is $1$ and whose other entries are $0$. These $E_i$'s are said to form a basis for $\mathbb{R}^n$. In case of confusion we shall denote them as $E_i^n$ to indicate the dimension of the space.
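The two operations and the basis decomposition are easy to try on a computer. Here is a minimal Python sketch, with vectors stored as tuples; the helper names `add`, `scale` and `basis` are chosen just for this example. It rebuilds the vector $(2,4,7)$ from the basis vectors $E_1, E_2, E_3$.

```python
def add(X, Y):
    """Vector addition: add corresponding entries."""
    return tuple(x + y for x, y in zip(X, Y))

def scale(k, X):
    """Scalar multiplication: multiply every entry by k."""
    return tuple(k * x for x in X)

def basis(n):
    """The vectors E_1, ..., E_n of R^n (E_i has 1 in position i, 0 elsewhere)."""
    return [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]

X = (2, 4, 7)
E = basis(3)

# Form the sum x_1 E_1 + x_2 E_2 + x_3 E_3 step by step.
total = (0, 0, 0)
for x_i, E_i in zip(X, E):
    total = add(total, scale(x_i, E_i))

print(total == X)
```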

We say that a function $T: \mathbb{R}^n \to \mathbb{R}^m$ is linear if for all $X, Y \in \mathbb{R}^n$ and $k \in \mathbb{R}$, $$T(kX +Y) = kT(X) + T(Y)$$
Can you think of some examples of linear maps? Since $T(X) = \sum_{i=1}^{n} x_i T(E_i)$, the values at the basis elements $E_i, i = 1,2,...,n$ determine the linear map. Moreover, if you wish to define a linear map $T$ satisfying $T(E_i) = F_i, i=1,2,...,n$, define $T(X) =  \sum_{i=1}^{n} x_i F_i$. It is not hard to verify that this map is well defined and linear. (Try this out! Read the paragraph on basis again.)

If $T: \mathbb{R}^n \to \mathbb{R}^m$ and $S: \mathbb{R}^m \to \mathbb{R}^l$ are maps, we can define the composition $S \circ T: \mathbb{R}^n \to \mathbb{R}^l$ as the map $(S \circ T)(X) = S(T(X))$. It is an easy exercise to verify that if $T$ and $S$ are linear, then $S \circ T$ is linear. (We usually write $ST$ for $S \circ T$.)

If  $T_1: \mathbb{R}^n \to \mathbb{R}^m$ and $T_2: \mathbb{R}^n \to \mathbb{R}^m$ are maps, then we can define $T_1 + T_2:\mathbb{R}^n \to \mathbb{R}^m$ as $(T_1 + T_2)(X) = T_1(X) + T_2(X)$. If $T_1$ and $T_2$ are linear, so is $T_1 + T_2$.
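As an illustration of building a linear map from prescribed values at the basis elements, here is a small Python sketch. The function name `linear_map_from_images` and the particular vectors $F_i$ are assumptions made for this example. It defines $T(X) = \sum_i x_i F_i$ and tests the linearity condition on one sample.

```python
def linear_map_from_images(F):
    """Return the map T with T(E_i) = F[i], defined by T(X) = sum_i x_i F_i."""
    m = len(F[0])  # dimension of the target space
    def T(X):
        return tuple(sum(x * f[r] for x, f in zip(X, F)) for r in range(m))
    return T

# Prescribed images of E_1, E_2, E_3 in R^2 (an arbitrary choice).
F = [(1, 0), (2, 0), (0, 3)]
T = linear_map_from_images(F)

X, Y, k = (1, -2, 5), (4, 0, 1), 3
lhs = T(tuple(k * x + y for x, y in zip(X, Y)))        # T(kX + Y)
rhs = tuple(k * a + b for a, b in zip(T(X), T(Y)))     # kT(X) + T(Y)
print(lhs == rhs)
```

A single sample does not prove linearity, of course; it only shows the condition in action.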

In the previous paragraphs, we saw how the values of a linear map $T$ at the basis elements completely determine the map, and how to construct a linear map with given values at the basis elements (and this map is unique by the first statement). We can interpret elements of $\mathbb{R}^n$ as column vectors ie. $n \times 1$ matrices. Let  $T: \mathbb{R}^n \to \mathbb{R}^m$ be linear. Now, form an $m \times n$ matrix with its $i^{th}$ column being $T(E_i^n)$ (since $T(E_i^n) \in \mathbb{R}^m$, it is interpreted as an $m \times 1$ matrix). We denote this matrix as $\left[T \right]$. The procedure for constructing a linear map with given values at the basis elements tells us that every $m \times n$ matrix is of the form $\left[T \right]$ for some linear map $T: \mathbb{R}^n \to \mathbb{R}^m$. So there is a bijective (one to one and onto) correspondence between the set of all linear functions from $\mathbb{R}^n$ to $\mathbb{R}^m$ (denoted as $\mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^m \right)$) and the set of all $m \times n$ matrices (denoted as $\mathcal{M}_{m \times n} (\mathbb{R})$).
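The recipe for $\left[T \right]$ can be written out in a few lines of Python. This is only a sketch: the sample map `T` and the helper names `standard_basis` and `matrix_of` are chosen for this example, and a matrix is stored as a list of rows.

```python
def standard_basis(n):
    """The vectors E_1, ..., E_n of R^n as tuples."""
    return [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]

def matrix_of(T, n):
    """Matrix [T] of a linear map T on R^n: column j holds T(E_j)."""
    columns = [T(E) for E in standard_basis(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def T(X):
    """A sample linear map from R^3 to R^2."""
    x, y, z = X
    return (x + 2 * y, 3 * z)

M = matrix_of(T, 3)
print(M)  # [[1, 2, 0], [0, 0, 3]]
```

Here $T(E_1) = (1,0)$, $T(E_2) = (2,0)$ and $T(E_3) = (0,3)$ appear as the three columns.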

Let $A$ be an $m \times n$ matrix. We denote the $(i,j)^{th}$ entry (ie. the number in the $i^{th}$ row and $j^{th}$ column) of $A$ by $A_{ij}$. Recall that if $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix, then we can define the matrix product $AB$ by $(AB)_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj}$. Usual properties like commutativity need not hold ie. it is not always true that $AB = BA$. However, associativity still holds ie. if  $A$ is an $m \times n$, $B$ is an $n \times p$ and $C$ is a $p \times q$ matrix, then $A(BC) = (AB)C$. (It requires a little work.) The distributive property is also valid ie. $A(B+C) = AB +AC$ and $(A+B)C = AC + BC$, whenever they are defined. (In addition of matrices, we add entrywise ie. $(A+B)_{ij} = A_{ij} + B_{ij}$.)
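A concrete pair of matrices makes the failure of commutativity visible. The Python sketch below implements the product formula directly; the name `matmul` and the three sample matrices are choices made for this illustration. It also checks associativity on one example.

```python
def matmul(A, B):
    """Matrix product: (AB)_ij = sum over k of A_ik * B_kj (matrices as lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
C = [[2, 0], [1, 3]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)  # [[2, 1], [1, 1]]
print(BA)  # [[1, 1], [1, 2]]

associative_here = matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)
print(associative_here)
```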

Let us look at the matrix $\left[T \right]$ in more detail. Denote $\left[T \right]_{ij}$ by $t_{ij}$. From our above discussion, we see that $T(E_j^n) = \sum_{i=1}^{m} t_{ij}E_i^m$. Let $X \in \mathbb{R}^n$ be as given previously, written as a column vector. We shall compute $\left[T \right]X$ (the product will give an $m \times 1$ matrix). $$(\left[T \right]X)_{i1} =  \sum_{k=1}^{n} t_{ik}X_{k1} = \sum_{k=1}^{n} t_{ik}x_k$$
Now, $T(X) = \sum_{k=1}^{n} x_k T(E_k^n) = \sum_{k=1}^{n} x_k \left( \sum_{i=1}^{m} t_{ik}E_i^m \right) =  \sum_{i=1}^{m} \left( \sum_{k=1}^{n} t_{ik}x_k \right) E_i^m$
$$\therefore T(X) = \sum_{i=1}^{m}( \left[T \right]X)_{i1}E_i^m$$
Thus $\left[T \right]X$ gives $T(X)$ in column vector form.
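As a small worked example (the map is chosen here only for illustration), take $T: \mathbb{R}^3 \to \mathbb{R}^2$ given by $T(x_1,x_2,x_3) = (x_1 + 2x_2, 3x_3)$. Then $T(E_1^3) = (1,0)$, $T(E_2^3) = (2,0)$ and $T(E_3^3) = (0,3)$, so

$$\left[T \right] = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} \quad \text{and} \quad \left[T \right]X = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 + 2x_2 \\ 3x_3 \end{pmatrix}$$

which is exactly $T(X)$ written as a column.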

It is easy to verify that $\left[ T_1 + T_2 \right] =  \left[T_1 \right] +  \left[T_2 \right]$. It takes slightly more work to show that $ \left[ST \right] =  \left[S \right] \left[T \right]$ (take it as a challenging exercise; it is similar to the calculation we have done above). Note that the identity function $id_n:\mathbb{R}^n \to \mathbb{R}^n$ is linear, ie. $id_n \in \mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^n \right)$, and $\left[id_n \right] $ is the $n \times n$ identity matrix $I_n$. The zero map $\mathbb{0}:\mathbb{R}^n \to \mathbb{R}^m$ given by $\mathbb{0}(X) = \textbf{0}$ is linear and, unsurprisingly, $\left[ \mathbb{0} \right]$ is the $m \times n$ matrix with all entries $0$.
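Before attempting the exercise, it may help to see the identities $\left[ T_1 + T_2 \right] = \left[T_1 \right] + \left[T_2 \right]$ and $\left[ST \right] = \left[S \right] \left[T \right]$ hold on an example. The Python sketch below uses sample maps and helper names (`matrix_of`, `matmul`, `T`, `T2`, `S`) that are assumptions of this illustration; it is a check on one instance, not a proof.

```python
def standard_basis(n):
    return [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]

def matrix_of(T, n):
    """Matrix [T] of a linear map T on R^n: column j holds T(E_j)."""
    columns = [T(E) for E in standard_basis(n)]
    return [[col[i] for col in columns] for i in range(len(columns[0]))]

def matmul(A, B):
    """(AB)_ij = sum over k of A_ik * B_kj."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def T(X):            # sample linear map R^3 -> R^2
    x, y, z = X
    return (x + 2 * y, 3 * z)

def T2(X):           # another sample linear map R^3 -> R^2
    x, y, z = X
    return (z, x - y)

def S(Y):            # sample linear map R^2 -> R^3
    a, b = Y
    return (a - b, 2 * a, b)

def ST(X):           # the composition S after T, a map R^3 -> R^3
    return S(T(X))

def T_plus_T2(X):    # the sum T + T2
    return tuple(p + q for p, q in zip(T(X), T2(X)))

product_matches = matrix_of(ST, 3) == matmul(matrix_of(S, 2), matrix_of(T, 3))
entrywise_sum = [[p + q for p, q in zip(r1, r2)]
                 for r1, r2 in zip(matrix_of(T, 3), matrix_of(T2, 3))]
sum_matches = matrix_of(T_plus_T2, 3) == entrywise_sum
print(product_matches, sum_matches)
```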

We have seen there is a deep connection between linear maps and matrices. The correspondence $T \mapsto \left[T \right]$ is a bijection $\mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^m \right) \to \mathcal{M}_{m \times n} (\mathbb{R})$ that preserves structure (the technical term is isomorphism). In conclusion, matrices are precisely linear maps.

PS: We worked with the set of real numbers $\mathbb{R}$. We could also work with $\mathbb{Q}$ or $\mathbb{C}$ (ie. the sets of rational and complex numbers respectively). More generally, the same reasoning applies when, instead of $\mathbb{R}^n$ or $\mathbb{C}^n$, we work with algebraic structures called finite dimensional vector spaces. Pick up a book on linear algebra to learn more. Hope this was a good motivation to study matrices.

References:

  • Linear Algebra $4^{th}$ ed. by Friedberg, Insel and Spence
  • Linear Algebra Done Right $3^{rd}$ ed. by Axler
  • What Is Mathematics? by Courant and Robbins




Mathematical Biscuit I

 Can you find two irrational numbers $a,b$ such that $a^b$ is rational? Surprisingly, yes and the argument is very easy.  If $\sqrt{2}^{\sq...