Sandy's Personal Website
personal musings

Topology of Numbers - III

(Introduction to continued fractions. Chapter 2 of Topology of Numbers.)

\[\frac{p}{q} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + ... \cfrac{1}{a_n}}}\]

A concise way to write this is \(\frac{p}{q} = a_0 + 1\nearrow a_1 + 1\nearrow a_2 + ... + 1\nearrow a_n = [a_0,a_1,\cdots,a_n]\)

The Euclidean Algorithm

The process for computing the continued fraction representation of a given rational number uses the Euclidean algorithm.

Here is an example, \( 19/7 \)

\[\begin{align*} 19 &= \color{red}{2}\cdot7 + 5 \newline 7 &= \color{red}{1}\cdot5 + 2 \newline 5 &= \color{red}{2}\cdot2 + \color{blue}{1} \newline 2 &= \color{red}{2}\cdot1 + 0 \newline \end{align*}\]

Therefore, \( 19/7 = \color{red}{2} + 1\nearrow\color{red}{1} + 1\nearrow \color{red}{2} + 1\nearrow \color{red}{2} \)

The Euclidean algorithm can also be used to find the greatest common divisor of two numbers. If one divides the larger number by the smaller, then the last non-zero remainder is the greatest common divisor. Here, \( \text{gcd}(19,7) = \color{blue}{1} \).

The idea behind finding the gcd is as follows: \( a = q\cdot b + r \), then any common divisor of \(a\) and \(b\), will also be a divisor of \(r\). Then, \(gcd(a,b) = gcd(b,r) \). Then, we can use this process recursively with the break condition that \(gcd(c,0) = c\). For a rational, this process must terminate as \(r < b\).

Note that Euclidean algorithm is more convenient than using the prime factorization to find the gcd.

How are Farey diagrams related to continued fractions? Firstly, define the convergents of a continued fraction as the number obtained by summing the first \(k \) elements in the continued fraction representation. The last fraction is the number itself, whose continued fraction representation we are studying.

E.g. \(19/7 = [2,1,2,2]\), the convergents are \( 2,3,8/3,19/7 \).

Let’s study a geometric way to connect the continued fraction representation to the number itself. Consider \(7/16 = [0,2,3,2] \). We use the three numbers \(2,3,2\) to construct a strip of three large triangles subdivided into \(2,3,2\) triangles respectively from left to right.

The rule to label the vertices is as follows: start on the left with labels \(1/0 \) and \( 0/1 \). Then use the mediant rule to compute the third label for each of the triangle in succession moving from left to right in the strip. It’s not an accident that the final label is \( 7/16 \). We can also do a similar procedure for fractions not in \( (0,1) \) by replacing \(0/1\) as the first label by \(a_0/1\).

Since the rule for labeling vertices along the horizontal strip is the mediant rule, each of the triangles in the strip is a triangle in the Farey diagram. Moreover, the strip of triangles are a sequence of adjacent triangles in the diagram. Also note that the corners of the blue triangles \( 0/1,1/2,3/7,7/16 \) are the convergents for the continued fraction representation of \(7/16\). We can consider a zigzag path along these convergents. Each successive vertex label \(p_i/q_i \) i.e. the convergent can be computed for \( p/q = [a_0,a_1,a_2,…,a_n] \) as follows:

\[\frac{p_i}{q_i} = \frac{a_i p_{i-1} + p_{i-2}}{a_i q_{i-1} + q_{i-2}}\]

One can understand this from the triangle strip diagram. The mediant rule is applied \( a_i \) times, when moving from two vertices of a blue triangle to the third vertex.

The zigzag path becomes a pinball type path in the Farey diagram. We start from \(1/0 \) to \(a_0/1\), then turn left across \(a_1 \) triangles, then right across \(a_2\) triangles and so on.

The convergents for a continued fraction \( p/q = [a_0,a_1,...,a_n] \) are the vertices along a zigzag path consisting of a finite sequence of edges in the Farey diagram, starting at \(1/0 \) and ending at \( p/q \). The path starts along the edge from \( 1/0 \) to \(a_0/1\), then turns left across a fan of \(a_1 \) triangles, the right across a fan of \(a_2\) triangles and so on, finally ending at \(p/q\).
We will show that the final point on the triangle strip construction \( p_n/q_n = p/q \). Consider the product, $$ P = \pmatrix{1 & a_0 \\ 0 & 1} \pmatrix{0 & 1 \\ 1 & a_1} \dots \pmatrix{0 & 1 \\ 1 & a_n}$$ If we consider multiplication from the left, $$ \pmatrix{1 & a_0 \\ 0 & 1}\pmatrix{0 & 1 \\ 1 & a_1} = \pmatrix{a_0 & 1 + a_0 a_1 \\ 1 & a_1} = \pmatrix{p_0 & p_1 \\ q_0 & q_1} $$ We can regard the columns as fractions. We can extend this and this finally obtain $$ P = \pmatrix{p_{n-1} & p_n \\ q_{n-1} & q_n} $$ Now let's work from right to left. Let \(r_i/s_i\) be the value of the tail \( [a_i,...,a_n] \). Then, $$ \frac{r_i}{s_i} = \frac{1}{a_i + \frac{r_{i+1}}{s_{i+1}}} = \frac{s_{i+1}}{a_i s_{i+1} + r_{i+1}} $$ Note that \( r_n/s_n = 1/a_n \), which is the second column of the last matrix in \(P\). If we consider, $$ \pmatrix{0 & 1 \\ 1 & a_i} \pmatrix{r_{i+1} \\ s_{i+1}} = \pmatrix{r_i \\ s_i} $$ and $$ \pmatrix{1 & a_0 \\ 0 & 1} \pmatrix{r_1 \\ s_1} = \pmatrix{r_1 + a_0 s_1 \\ s_1} = \pmatrix{ p\\ q} $$ Hence, the second column of \( P \) will be \( \pmatrix{p \\ q} \) which is that we want.

We can deduce another interesting fact from the preceding proof. Consider the fraction with \( a_0 = 0 \). If we reverse the order of \(a_i \), the denominator remains the same. To see this, take the transpose of \( P \).

Note that the determinant in each of matrices defining \( P \) is \( -1 \). Then \(p_{n-1} q_n - p_n q_{n-1} = (-1)^n \).

Determinants and the Farey Diagram

In the Farey diagram, two vertices labeled \(a/b\) and \(c/d\) are joined by an edge iff \( ad - bc = \pm 1 \).
The forward implication is easy and follows from the mediant rule inductively. For the converse, consider the following lemma,
The mediant rule for labeling vertices always produces labels \( a/b \) that are fractions in lowest terms.
This is easy. From the construction, \(a/b\) is connected to some \(c/d\) such that \( a d - b c = \pm 1\). If the \(a\) and \(b\) had a common factor, it would also divide \(\pm 1\). Hence, \(a/b\) is in lowest terms.
We shall now prove the converse of the theorem, by which there is an edge joining \(a/b\) to \( c/d \) whenever \( a d - b c = \pm 1 \). Consider \(a/b\) obtained by the mediant rule applied to \( c/d \) and \(e/f \), such that \( a = c + e, b = d + f \). If we consider the family of fractions, \( \frac{c + k a }{d + k b} \), where \( k \) runs over all integers, we can see that this will reproduce all the edges produced by starting with \(c/d\) and \(e/f\). This points to us that we must look for families of integer solutions to certain equations. Hence, consider the following lemma:
\(a\) and \(b\) are integers with no common divisor greater than \(1 \). If one solution of \(a y - b x = n \) is \( (x,y) = (c,d) \), then the general solution is \( (x,y) = (c + k a, d + k b) \) for \( k \) being an arbitrary integer.
The basic argument is from linear algebra, where a general solution to a inhomogeneous system is obtained by from a particular solution by adding the general solution of the associated homogeneous system. Consider \( a y - b x = n\). Since, \( a d - b c = n \), if we subtract these two equations, we get \(a (y - d) = b (x - c) \). As \(a\) and \(b\) are coprime, \(a \) must divide \( (x-c) \) and \( b \) must divide \( (y - d) \). Hence, we can conclude \( y - d = k a, x - c = k b \), which is our required result.
Now we can finish the proof of the converse. Using the lemma for the case \( n = \pm 1 \), and idea discusssed above, the edges in the Farey diagram with \(a/b\) at one endpoint account for all matrices \( \begin{pmatrix} a & x \\ b & y \end{pmatrix} \) with determinant \( a y - b x = \pm 1 \).

The Diophantine Equation \( a x + b y = n\)

The Euclidean algorithm and continued fractions can be used to compute all integer solutions of \( a x + b y = n \). We assume \(a \) or \(b \) are non-zero. Changing the signs of \(x\) and \(y \), we can rewrite the equation as \( a x - b y = n \), where \(a\) and \(b \) are both positive.

If \(a\) and \(b\) have gcd \(d > 1\), the for solutions to exist, \(d\) must divide \(n\). Then, we can divide both sides by \(d\), to get a new equation with \(a\) and \(b\) coprime.

We can find the general solution, if we know a particular solution. It suffices to the case \(n = 1\).

Solutions of \(a x - b y \) always exist when \(a\) and \(b\) are coprime, and a way to find one is to find an edge in the Farey diagram with \(a/b\) at one end. We can do this by finding the continued fraction representation for \(a/b\) using the triangle strip method. Then, the convergents, will be the required \( (x,y) \).

For example, consider solutions to \( 19 x - 7 y = 1\), whose continued fraction representation we have computed before. One of the covergents is \( 8/3 \). Hence, our required general solution is \( (x,y) = (3 + 7 k, 8 + 19 k) \).

From a geometric point of view, finding solution to \(a x + b y = n\) in the plane is equivalent to looking for intersections of the line with the integer lattice. It is possible for the line to never intersect the lattice when \( gcd(a,b) \) does not divide \( n\). In the case, it does intersect, there are infinitely many solutions and they are spaced at equal intervals along the line.

Infinite Continued Fractions

Irrational numbers can be represented as infinite continued fractions. For every infinite continued fraction \([a_0,a_1,a_2,\cdots] \), the convergents converge to a unique limit.

Every irrational number has an expression as an infinite continued fraction and this continued fraction is unique.
In the Farey diagram, consider the vertical line \(L\) going upward from a given irrational number \(x\) on the \(x\)-axis. The lower endpoint of \(L\) is not a vertex of the Farey diagram as \(x\) is irrational. As we move downward along \(L\) from infinity, we cross a sequence of triangles, entering each triangle by crossing its upper edge and exiting the triangle by crossing one of its two lower edges. This sequence of triangles is infinite. The left and right endpoints of the edges in the sequence must be approaching a single point \(x \). The zigzag path along this sequence gives us the continued fraction for \(x\). To show uniqueness, we argue that the geometrical path is unique.

How do we find the continued fraction representation for an irrational number? Here is a sequence of steps,

  1. Write \(x = a_0 + r_1 \), where \(a_0\) is an integer and \( 0 \leq r_1 < 1\).
  2. Write \(1/r_1 = a_1 + r_2 \), where \(a_1\) is an integer and \( 0 \leq r_2 < 1\).
  3. Write \(1/r_2 = a_2 + r_3 \), where \(a_2\) is an integer and \( 0 \leq r_3 < 1\). and so on.

Here, is how it works for \(x = \sqrt{2} \).

  1. \(\sqrt{2} = 1 + (\sqrt{2} - 1) \)
  2. \(\frac{1}{\sqrt{2} - 1} = \sqrt{2} + 1 = 2 + (\sqrt{2} - 1) \)
  3. \(\frac{1}{\sqrt{2} - 1} = \sqrt{2} + 1 = 2 + (\sqrt{2} - 1) \)

and so on.

Hence, \( \sqrt{2} = [1,2,2,2,\cdots] \). Doing a similar calculation, \( \sqrt{3} = [1,1,2,1,2,1,2,\cdots ] \). We will use an overhead bar to denote repetition. Hence, \(\sqrt{2} = [1,\overline{1,2} ] \).

It is true in general that for every positive integer \(n \) that is not a perfect square, the continued fraction for \( \sqrt{n} \) is of the form \([a_0,\overline{a_1,\cdots,a_k} ] \).

There are more curious facts which we will prove later:

  1. The last term of the period is always such that \(a_k = 2 a_0 \).

  2. If the last term of the period is omitted, the preceding terms in the period form a palindrome.

We naturally ask which irrational numbers have continued fractions that are eventually periodic. The answer is given by the following theorem by Lagrange:

(Lagrange's Theorem) The irrational numbers whose continued fraction are eventually periodic are exactly the numbers of the form \(a + b\sqrt{n} \), where \( a \) and \(b \) are rational, \( b \neq 0 \) and \(n \) is a positive integer that is not a square.

The numbers of \(a + b \sqrt{n} \) are called quadratic irrationals because they are roots of quadratic equations with integer coefficients.

What about other irrationals? They have patterns, but are not periodic.

\begin{align} e &= [2,1,2,1,1,4,1,1,6,1,\cdots] \newline \frac{e - 1}{e + 1} &= [2,6,10,14,\cdots] \newline \frac{e^2 - 1}{e^2 + 1} &= [1,3,5,7,\cdots]
\end{align}

But for \(\sqrt[3]{2}\) and \( \pi \), there are no known patters. The convergents to \( \pi \) gives the best rational approximations to \( \pi \) as \( 3,\frac{22}{7},\frac{333}{106},\frac{355}{113},\cdots \).