
Thursday, 29 June 2023

The Nine Point Circle

The nine point circle (also known as Feuerbach's circle) is one of the most interesting topics in elementary geometry. It does not require many prerequisites, so this post can be read by a high school student who has some idea about the geometry of circles and triangles.

Just a reminder: all three altitudes of a triangle meet at a single point, which we call the orthocenter, and a set of points is said to be concyclic if they all lie on a single circle.






Consider \triangle ABC and let O be the orthocenter (shown in green). Let A_a be the foot of the altitude from vertex A on side BC, A_m be the midpoint of side BC and A_o be the midpoint of line segment AO (we have used the colors red, blue and orange respectively for these points). We use similar notation for vertices B and C. Then the nine points (which may sometimes not be distinct) A_o, B_o, C_o (midpoints between the orthocenter and the vertices), A_m, B_m, C_m (midpoints of the sides) and A_a, B_a, C_a (feet of the altitudes) lie on the same circle, which is commonly called the nine point circle (shown as a dotted circle).

Try interacting with the GeoGebra applet given below (\Omega is the center of nine point circle):


To prove this we need some elementary results:

  • A convex quadrilateral is cyclic (ie. its vertices are concyclic) if and only if its opposite angles are supplementary (ie. they sum to 180^{\circ}). One half of this result is Euclid's proposition 22 in Book III.
  • The line segment connecting the midpoints of two sides of a triangle is parallel to the third side and has half its length. This is called the midpoint theorem.
  • The center of the circumcircle of a right triangle is the midpoint of its hypotenuse, so the hypotenuse is a diameter of that circle. This is the converse of a famous result called Thales' theorem.
Let us start by constructing line segments C_mB_m, B_oC_o, C_mB_o and C_oB_m.




Now in \triangle OBC, we use the midpoint theorem to conclude that B_oC_o is parallel to BC. Similarly, in \triangle ABC we get that C_mB_m is parallel to BC. Hence B_oC_o and C_mB_m are parallel. By considering \triangle AOB and \triangle AOC, we get that C_mB_o and C_oB_m are parallel (since both are parallel to AO by the midpoint theorem). So \square C_mB_mC_oB_o is a parallelogram. Since line AO (ie. line AA_a) is perpendicular to BC, we get that C_oB_m is perpendicular to C_mB_m. So \square C_mB_mC_oB_o is a rectangle and in particular is a cyclic quadrilateral. Moreover, by the converse of Thales' theorem, its diagonals C_mC_o and B_mB_o are diameters of the circle passing through the vertices of \square C_mB_mC_oB_o. Finally, C_a lies on line AB (as does C_m) and on the altitude through C (as does C_o), so \angle C_mC_aC_o is a right angle; again by the converse of Thales' theorem, C_a lies on the circle with diameter C_mC_o. The same argument with B_m and B_o shows that B_a lies on this circle too.

We have proved that C_m, B_m, C_o, B_o, C_a, B_a lie on a circle with diameters C_mC_o and B_mB_o. Similarly, we can show that A_m, B_m, A_o, B_o, A_a, B_a lie on a circle with diameters A_mA_o and B_mB_o, and that A_m, C_m, A_o, C_o, A_a, C_a lie on a circle with diameters A_mA_o and C_mC_o. Looking at the diameters of these three circles, we see that any two of them have a common diameter. Since a circle is determined by any one of its diameters, these three circles must coincide, and this common circle is indeed the required nine point circle.
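
The statement can also be checked numerically. The following is a minimal Python sketch, not part of the proof: the helper names (`midpoint`, `foot`, `orthocenter`, `nine_points`) and the sample triangle are assumptions made only for illustration. It takes the midpoint of the diameter A_mA_o as the center and checks that all nine points are at the same distance from it.

```python
import math

def midpoint(p, q):
    """Midpoint of the segment pq."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def foot(p, a, b):
    """Foot of the perpendicular dropped from p onto the line ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return (a[0] + t * dx, a[1] + t * dy)

def orthocenter(A, B, C):
    """Solve (O - A).(C - B) = 0 and (O - B).(A - C) = 0 by Cramer's rule."""
    a1, b1 = C[0] - B[0], C[1] - B[1]
    c1 = A[0] * a1 + A[1] * b1
    a2, b2 = A[0] - C[0], A[1] - C[1]
    c2 = B[0] * a2 + B[1] * b2
    det = a1 * b2 - a2 * b1  # nonzero when A, B, C are not collinear
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def nine_points(A, B, C):
    """The nine points, named as in the text."""
    O = orthocenter(A, B, C)
    return {
        "A_a": foot(A, B, C), "B_a": foot(B, C, A), "C_a": foot(C, A, B),
        "A_m": midpoint(B, C), "B_m": midpoint(C, A), "C_m": midpoint(A, B),
        "A_o": midpoint(A, O), "B_o": midpoint(B, O), "C_o": midpoint(C, O),
    }

# Sample triangle (an arbitrary choice for this sketch).
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
pts = nine_points(A, B, C)

# A_m A_o is a diameter, so its midpoint is the center of the circle.
center = midpoint(pts["A_m"], pts["A_o"])
radius = math.dist(center, pts["A_m"])
all_on_circle = all(
    math.isclose(math.dist(center, p), radius, abs_tol=1e-9)
    for p in pts.values()
)
print(center, radius, all_on_circle)
```

Changing the coordinates of A, B and C is an easy way to experiment; a floating point check like this only suggests the result for particular triangles, it does not replace the argument above.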

References:
  •  A Sequel to the First Six Books of Euclid by John Casey
  • David Joyce's web version of Euclid's Elements



Sunday, 25 June 2023

Riemann Sphere and Stereographic Projection

While studying complex valued functions, it is sometimes useful to consider the extended complex plane \mathbb{C} \cup \{ \infty \}, where we append an additional point \infty to the set of usual complex numbers. This is especially helpful in the study of Möbius maps z \mapsto \frac{az+b}{cz+d} where ad-bc \not = 0.

The Riemann sphere provides us with a model of the extended complex plane. This makes it useful in complex analysis because it allows us to make sense of division by zero in some circumstances (\frac{0}{0} is still undefined).

We start with the unit sphere S^2 in \mathbb{R}^3, ie. the spherical surface with center at the origin and unit radius (the set of points (x,y,z) satisfying x^2 +y^2 +z^2=1). First, we shall show that there is a bijection between \mathbb{R}^2 and S^2 minus a single point. (We identify the xy-plane with \mathbb{R}^2.)

Start with a point P on S^2 other than the north pole N = (0,0,1). Draw a ray starting from N and passing through P. This ray will intersect the xy-plane at a unique point Q. The map that takes P to Q is called stereographic projection and is denoted by \Pi.


We shall use some properties of vectors to derive stereographic projection explicitly. Q - N is parallel to P - N, so P - N = k(Q - N) for some scalar k. Denote P=(x,y,z) and Q=(u,v,0). Then, we get (x,y,z) = (ku,kv,1-k) so that k=1-z. Note z \not = 1 since P \not = N.

\therefore \Pi (x,y,z) = (u,v,0) = \left( \frac{x}{1-z}, \frac{y}{1-z}, 0 \right)

Since x^2 + y^2 + z^2 = 1, we have k^2u^2 + k^2v^2 + (1-k)^2 = 1. Simplifying (while noting k \not = 0 since z \not = 1),

k = \frac{2}{u^2+v^2 +1}

\therefore P = (x,y,z) = \left(  \frac{2u}{u^2+v^2 +1}, \frac{2v}{u^2+v^2 +1}, \frac{u^2 +v^2 -1}{u^2+v^2 +1} \right)
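
For readers who want the simplification spelled out, here is one way to write the intermediate steps, using only the relations (x,y,z) = (ku,kv,1-k) and x^2 + y^2 + z^2 = 1 from above:

```latex
\begin{aligned}
k^2u^2 + k^2v^2 + (1-k)^2 &= 1 \\
k^2(u^2+v^2) + 1 - 2k + k^2 &= 1 \\
k^2(u^2+v^2+1) &= 2k \\
k &= \frac{2}{u^2+v^2+1} \qquad (\text{dividing by } k \not = 0) \\
x = ku = \frac{2u}{u^2+v^2+1}, \quad
y = kv &= \frac{2v}{u^2+v^2+1}, \quad
z = 1-k = \frac{u^2+v^2-1}{u^2+v^2+1}
\end{aligned}
```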

It takes some simple calculations to verify that \Pi : S^2 \setminus \{ N \} \to \mathbb{R}^2 is a bijection and the map (u,v,0) \mapsto \left(  \frac{2u}{u^2+v^2 +1}, \frac{2v}{u^2+v^2 +1}, \frac{u^2 +v^2 -1}{u^2+v^2 +1} \right) is its inverse.
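
The two formulas are easy to try out on a computer. Below is a small Python sketch; the function names `project` and `unproject` and the sample point are assumptions made for this illustration. It maps a point of the plane to the sphere, checks that the image has unit length, and projects it back.

```python
import math

def project(p):
    """Stereographic projection Pi: (x, y, z) on S^2 minus N -> (u, v)."""
    x, y, z = p
    if z == 1:
        raise ValueError("the north pole has no image in the plane")
    return (x / (1 - z), y / (1 - z))

def unproject(q):
    """Inverse map: (u, v) in the plane -> (x, y, z) on S^2 minus N."""
    u, v = q
    d = u * u + v * v + 1
    return (2 * u / d, 2 * v / d, (u * u + v * v - 1) / d)

q = (0.5, -2.0)                # sample point in the plane
p = unproject(q)               # its preimage on the sphere
on_sphere = math.isclose(p[0] ** 2 + p[1] ** 2 + p[2] ** 2, 1.0)
round_trip = project(p)        # should agree with q up to rounding
print(p, on_sphere, round_trip)
```

Points of the unit circle in the plane are fixed (the equator), points inside it go to the lower hemisphere, and points outside it go to the upper hemisphere.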


Interact with GeoGebra applet to see how stereographic projection actually works.

In fact our map \Pi is conformal (ie. it preserves angles), but the proof of this assertion requires some tools from differential geometry.

We can extend \Pi to the whole sphere S^2 by simply defining \Pi (N) = \infty. Here, \infty is just a symbol. We similarly extend \Pi^{-1} :  \mathbb{R}^2 \cup \{ \infty \} \to S^2. Since we can interpret \mathbb{C} as \mathbb{R}^2 (from a topological point of view), we can also define \Pi^{-1} :  \mathbb{C} \cup \{ \infty \} \to S^2 as z \mapsto \left(  \frac{2 \Re(z)}{|z|^2 +1}, \frac{2 \Im(z)}{|z|^2 +1}, \frac{|z|^2 -1}{|z|^2 +1} \right)

where \Re(z) and \Im(z) are real and imaginary part of z respectively, and

\infty \mapsto (0,0,1)

PS: From a topological point of view, we can define a topology \tau_1 on \mathbb{C} \cup \{\infty\} such that \mathbb{C} with its usual topology \tau_2 is a subspace of (\mathbb{C} \cup \{\infty\}, \tau_1), and (\mathbb{C} \cup \{\infty\}, \tau_1) is the one point compactification of (\mathbb{C}, \tau_2).

References:

  • A Pathway to Complex Analysis by S. Kumaresan
  • Topology by J. Munkres
  • A Comprehensive introduction to Differential Geometry Vol. 2 by M. Spivak










Wednesday, 21 June 2023

What is a matrix?

The following post is mostly aimed at high school students interested in mathematics (and even budding undergrads). We have learned in our schools that a matrix is a rectangular array of numbers. But is that all?

In physics we have seen vector quantities like force, displacement etc. A vector like 2i+4j+7k can also be represented by the ordered triple (2,4,7). We can generalize to ordered n-tuples and talk about vectors in n-dimensional space. We consider the set \mathbb{R}^n of all ordered n-tuples of real numbers (each individual n-tuple will be referred to as a vector). Let X = \left( x_1, x_2,...,x_n \right), Y = \left( y_1, y_2,...,y_n \right) and let k be a real number (also called a scalar). Then X+Y is defined as the vector \left( x_1 + y_1, x_2 + y_2,...,x_n + y_n \right) and k \cdot X is defined as \left( kx_1, kx_2,...,kx_n \right). The operations + and \cdot are called vector addition and scalar multiplication respectively. We usually write k \cdot X as kX. For n=2 and 3, this coincides with the usual notion of vectors in classical physics, and for n=1 it is the usual addition and multiplication of real numbers.

Some properties which we can easily verify are:
  • X+Y = Y+X
  • X+(Y+Z) = (X+Y)+Z
  • There exists \textbf{0} \in \mathbb{R}^n such that for all X \in \mathbb{R}^n, X +\textbf{0} = X. Just choose \textbf{0} = \left( 0, 0,...,0 \right).
  • For all X \in \mathbb{R}^n there exists X' \in  \mathbb{R}^n such that X+X'=\textbf{0}. Take X' = (-1)X; it is not hard to see that this is the only possible choice for X'.
  • (k_1k_2) X = k_1 (k_2  X)
  • (k_1 + k_2)  X = k_1  X + k_2 X
  • k (X +Y) = k  X + k  Y
  • 0X = \textbf{0} and 1X = X
It is easy to see that every vector X in \mathbb{R}^n can be represented uniquely in the form \sum_{i=1}^{n} x_i E_i, where X = \left( x_1, x_2,...,x_n \right) and E_i is the n-tuple whose i^{th} entry is 1 and whose other entries are 0. These E_i's are said to form a basis for \mathbb{R}^n. In case of confusion we shall denote them by E_i^n to indicate the dimension of the space.
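
To make these definitions concrete, here is a short Python sketch that models vectors in \mathbb{R}^n as tuples. The function names `add`, `scale`, `basis` and `from_basis` are assumptions chosen for this illustration. It implements vector addition, scalar multiplication and the basis vectors E_i, and rebuilds a vector from the sum \sum x_i E_i.

```python
def add(X, Y):
    """Vector addition X + Y, entry by entry."""
    if len(X) != len(Y):
        raise ValueError("vectors must have the same number of entries")
    return tuple(x + y for x, y in zip(X, Y))

def scale(k, X):
    """Scalar multiplication kX."""
    return tuple(k * x for x in X)

def basis(i, n):
    """E_i in R^n, with i counted from 1 as in the text."""
    return tuple(1 if j == i else 0 for j in range(1, n + 1))

def from_basis(X):
    """Rebuild X as the sum of x_i * E_i."""
    n = len(X)
    total = (0,) * n
    for i, x_i in enumerate(X, start=1):
        total = add(total, scale(x_i, basis(i, n)))
    return total

X = (2, 4, 7)
Y = (1, -1, 0)
print(add(X, Y))        # (3, 3, 7)
print(scale(3, X))      # (6, 12, 21)
print(basis(2, 3))      # (0, 1, 0)
print(from_basis(X))    # (2, 4, 7)
```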

We say that a function T: \mathbb{R}^n \to \mathbb{R}^m is linear if for all X, Y \in \mathbb{R}^n and k \in \mathbb{R}, T(kX +Y) = kT(X) + T(Y).
Can you think of some examples of linear maps? Since T(X) = \sum_{i=1}^{n} x_i T(E_i), the values at the basis elements E_i, i = 1,2,...,n determine the linear map. Moreover, if you wish to define a linear map T satisfying T(E_i) = F_i for given F_i \in \mathbb{R}^m, i=1,2,...,n, define T(X) =  \sum_{i=1}^{n} x_i F_i. It is not hard to verify that this map is well defined and linear. (Try this out! Read the paragraph on basis again.)

If T: \mathbb{R}^n \to \mathbb{R}^m and S: \mathbb{R}^m \to \mathbb{R}^l are maps, we can define the composition S \circ T: \mathbb{R}^n \to \mathbb{R}^l as the map (S \circ T)(X) = S(T(X)). It is an easy exercise to verify that if T and S are linear, then S \circ T is linear. (We usually write ST for S \circ T.)

If T_1: \mathbb{R}^n \to \mathbb{R}^m and T_2: \mathbb{R}^n \to \mathbb{R}^m are maps, then we can define T_1 + T_2:\mathbb{R}^n \to \mathbb{R}^m as (T_1 + T_2)(X) = T_1(X) + T_2(X). If T_1 and T_2 are linear, so is T_1 + T_2.
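
As a quick experiment with the definition, the Python sketch below tests the condition T(kX + Y) = kT(X) + T(Y) on a handful of sample vectors and scalars. The maps `T` and `N`, the helper `passes_linearity_check` and the samples are all made up for this illustration. Passing such a check on finitely many samples does not prove linearity, but a single failure does show that a map is not linear.

```python
from itertools import product

def add(X, Y):
    """Vector addition, entry by entry."""
    return tuple(x + y for x, y in zip(X, Y))

def scale(k, X):
    """Scalar multiplication kX."""
    return tuple(k * x for x in X)

def passes_linearity_check(T, vectors, scalars):
    """Test T(kX + Y) == kT(X) + T(Y) for every sample X, Y and k."""
    return all(
        T(add(scale(k, X), Y)) == add(scale(k, T(X)), T(Y))
        for X, Y in product(vectors, repeat=2)
        for k in scalars
    )

def T(X):
    """A linear map R^3 -> R^2."""
    x1, x2, x3 = X
    return (x1 + 2 * x2, 3 * x3)

def N(X):
    """Not linear: the first entry is squared."""
    x1, x2, x3 = X
    return (x1 * x1, x2)

samples = [(1, 0, 0), (0, 1, 0), (2, -1, 3)]
scalars = [0, 1, -2, 5]
print(passes_linearity_check(T, samples, scalars))  # True
print(passes_linearity_check(N, samples, scalars))  # False
```

Integer samples are used so that the equality test is exact.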

In the previous paragraphs, we saw how the values of a linear map T at the basis elements completely determine the map, and how to construct a linear map with given values at the basis elements (this construction is unique by the first statement). We can interpret elements of \mathbb{R}^n as column vectors ie. n \times 1 matrices. Let T: \mathbb{R}^n \to \mathbb{R}^m be linear. Now, form an m \times n matrix with its i^{th} column being T(E_i^n) (since T(E_i^n) \in \mathbb{R}^m, it is interpreted as an m \times 1 matrix). We denote this matrix by \left[T \right]. The procedure for constructing a linear map with given values at the basis elements tells us that every m \times n matrix is of the form \left[T \right] for some linear map T: \mathbb{R}^n \to \mathbb{R}^m. So there is a bijective (one to one and onto) correspondence between the set of all linear functions from \mathbb{R}^n to \mathbb{R}^m (denoted by \mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^m \right)) and the set of all m \times n matrices (denoted by \mathcal{M}_{m \times n} (\mathbb{R})).
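
The correspondence can be sketched in a few lines of Python. The names `matrix_of` and `map_of`, and the example map `T`, are assumptions made for this illustration; matrices are stored as lists of rows. `matrix_of` builds \left[T \right] column by column from the values T(E_i), and `map_of` goes back from a matrix to a map.

```python
def basis(i, n):
    """E_i in R^n, with i counted from 1 as in the text."""
    return tuple(1 if j == i else 0 for j in range(1, n + 1))

def matrix_of(T, n):
    """[T] for a linear T: R^n -> R^m; the i-th column is T(E_i)."""
    columns = [T(basis(i, n)) for i in range(1, n + 1)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def map_of(M):
    """The map X -> MX whose matrix is M (a list of rows)."""
    def T(X):
        return tuple(sum(row[j] * X[j] for j in range(len(X))) for row in M)
    return T

def T(X):
    """Example linear map R^3 -> R^2."""
    x1, x2, x3 = X
    return (x1 + 2 * x2, 3 * x3)

M = matrix_of(T, 3)
print(M)                          # [[1, 2, 0], [0, 0, 3]]
print(map_of(M)((2, -1, 3)))      # (0, 9)
print(T((2, -1, 3)))              # (0, 9)
print(matrix_of(map_of(M), 3))    # [[1, 2, 0], [0, 0, 3]]
```

Going from a map to its matrix and back returns the same values, which is the bijection described above seen on one example.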

Let A be an m \times n matrix. We denote the (i,j)^{th} entry (ie. the number in the i^{th} row and j^{th} column) of A by A_{ij}. Recall that if A is an m \times n matrix and B is an n \times p matrix, then we can define the matrix product AB by (AB)_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj}; it is an m \times p matrix. Usual properties like commutativity need not hold, ie. it is not always true that AB = BA. However, associativity still holds, ie. if A is an m \times n, B is an n \times p and C is a p \times q matrix, then A(BC) = (AB)C. (It requires a little work.) The distributive property is also valid, ie. A(B+C) = AB +AC and (A+B)C = AC + BC, whenever they are defined. (In addition of matrices, we add entrywise, ie. (A+B)_{ij} = A_{ij} + B_{ij}.)
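
Here is a small Python sketch of the product formula, with matrices stored as lists of rows. The function name `matmul` and the three sample matrices are assumptions made for this illustration. It shows a pair with AB \not = BA and checks associativity on one example.

```python
def matmul(A, B):
    """Matrix product: (AB)_ij = sum over k of A_ik * B_kj."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    p = len(B[0])
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
        for i in range(len(A))
    ]

A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
C = [[2, 0], [0, 3]]

print(matmul(A, B))   # [[2, 1], [1, 1]]
print(matmul(B, A))   # [[1, 1], [1, 2]]
print(matmul(A, matmul(B, C)) == matmul(matmul(A, B), C))   # True
```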

Let us look at the matrix \left[T \right] in more detail. Denote \left[T \right]_{ij} by t_{ij}. Since the j^{th} column of \left[T \right] is T(E_j^n), we see that T(E_j^n) = \sum_{i=1}^{m} t_{ij}E_i^m. Let X = \left( x_1, x_2,...,x_n \right) \in \mathbb{R}^n be as before, viewed as an n \times 1 matrix. We shall compute \left[T \right]X (the product is an m \times 1 matrix).

(\left[T \right]X)_{i1} =  \sum_{k=1}^{n} t_{ik}X_{k1} = \sum_{k=1}^{n} t_{ik}x_k

Now, T(X) = \sum_{k=1}^{n} x_k T(E_k^n) = \sum_{k=1}^{n} x_k \left( \sum_{i=1}^{m} t_{ik}E_i^m \right) =  \sum_{i=1}^{m} \left( \sum_{k=1}^{n} t_{ik}x_k \right) E_i^m

\therefore T(X) = \sum_{i=1}^{m}( \left[T \right]X)_{i1}E_i^m

Thus \left[T \right]X gives T(X) in column vector form.

It is easy to verify that \left[ T_1 + T_2 \right] =  \left[T_1 \right] +  \left[T_2 \right]. But it takes slightly more work to show that \left[ST \right] =  \left[S \right] \left[T \right] (take it as a challenging exercise; it is similar to the calculation we have done above). Note that the identity function id_n:\mathbb{R}^n \to \mathbb{R}^n belongs to \mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^n \right) and \left[id_n \right] is the n \times n identity matrix I_n. The zero map \mathbb{0}:\mathbb{R}^n \to \mathbb{R}^m given by \mathbb{0}(X) = \textbf{0} is linear and, unsurprisingly, \left[ \mathbb{0} \right] is the m \times n matrix with all entries 0.
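
Before attempting the exercise, it may help to see the identity \left[ST \right] = \left[S \right] \left[T \right] on a concrete example. The Python sketch below is only a sanity check on one pair of maps, not a proof; the maps `T` and `S` and the helper names are assumptions made for this illustration.

```python
def basis(i, n):
    """E_i in R^n, with i counted from 1 as in the text."""
    return tuple(1 if j == i else 0 for j in range(1, n + 1))

def matrix_of(T, n):
    """[T] for a linear T: R^n -> R^m; the i-th column is T(E_i)."""
    columns = [T(basis(i, n)) for i in range(1, n + 1)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def matmul(A, B):
    """Matrix product: (AB)_ij = sum over k of A_ik * B_kj."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def T(X):
    """Linear map R^3 -> R^2."""
    x1, x2, x3 = X
    return (x1 + 2 * x2, 3 * x3)

def S(Y):
    """Linear map R^2 -> R^2."""
    y1, y2 = Y
    return (y1 - y2, 2 * y1)

def ST(X):
    """The composition S o T: R^3 -> R^2."""
    return S(T(X))

print(matrix_of(ST, 3))                            # [[1, 2, -3], [2, 4, 0]]
print(matmul(matrix_of(S, 2), matrix_of(T, 3)))    # [[1, 2, -3], [2, 4, 0]]
```

Note the order: the matrix of S comes first in the product, matching the order in S \circ T.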

We have seen there is a deep connection between linear maps and matrices. The correspondence T \mapsto \left[T \right] is a bijection \mathcal{L} \left( \mathbb{R}^n , \mathbb{R}^m \right) \to \mathcal{M}_{m \times n} (\mathbb{R}) that preserves structure (the technical term is isomorphism). In conclusion, matrices are precisely linear maps.

PS: We worked with the set of real numbers \mathbb{R}. We could also work with \mathbb{Q} or \mathbb{C} (ie. the sets of rational and complex numbers respectively). More generally, the same reasoning applies when, instead of \mathbb{R}^n or \mathbb{C}^n, we work with algebraic structures called finite dimensional vector spaces. Pick up a book on linear algebra to learn more. Hope this was a good motivation to study matrices.

References:

  • Linear Algebra 4^{th} ed. by Friedberg, Insel and Spence
  • Linear Algebra Done Right 3^{rd} ed. by Axler
  • What Is Mathematics? by Courant and Robbins




Mathematical Biscuit I

 Can you find two irrational numbers a,b such that a^b is rational? Surprisingly, yes and the argument is very easy.  If $\sqrt{2}^{\sq...