CO 739: Analytic and Algorithmic Combinatorics (Winter 2021)

This webpage will not be updated during the term. All course materials will be posted on LEARN.

Course details can be found in the course syllabus.

Organization: Because of the ongoing pandemic, this course will meet online only. Organization of the course will be discussed with students during the first week of lecture. The current plan is to offer up to two hours of prerecorded lectures per week, together with an optional one hour live session for students to ask questions. I expect to post lectures and other material on LEARN.

Assessment: Students will be evaluated on three homework assignments (each 20% of final grade) and a final project (40% of final grade). For the final project students will write a report and make a short presentation to the class about a research paper related to their interests and the course content (other options, like a historical survey or coding project, are also possible). Deadlines for the assignments and project will be set after consultation with the class at the beginning of the semester.

Prerequisites: This course may be of interest to students in a variety of areas (combinatorics, computer science, pure math, etc.) who will come in with different backgrounds. Students should be comfortable with the basics of real analysis (sequences and series) and ideally will have previous exposure to basic algebra (rings and fields) and complex analysis (analytic functions, Cauchy residue theorem). Short background lessons will be provided for students wanting to review (or needing to catch up on) some of the necessary background material. Interested students unsure of their background should email the instructor for more information. Students are not expected to have previous exposure to computer algebra (see below for more details on computing in the course).

References: Our main reference is the textbook

Further references can be found in the syllabus. All works have free manuscripts available online.

Examples of Course Topics

These examples can be explored interactively in the browser (after a short load time) by clicking on this link (a non-interactive version is here). Note that randomness will change the output each run.

1. Random Walks in Cones

Consider walks on the steps North, South, East, and West, which start at the origin and stay in the quadrant1 $\mathbb{N}^2$. How many such walks are there on $n$ steps? Equivalently (since there are $4^n$ unrestricted walks) what is the probability that a walk of length $n$ never leaves the first quadrant?

100 random walks of length 300, greyed out once they leave the first quadrant. Of the 100 starting walks, only one stays in the first quadrant after all 300 steps. Readers can generate their own animations using the interactive link above. This animation resets every 7 seconds. If it does not play in your browser click here.

Using tools like creative telescoping, numeric analytic continuation, and analytic combinatorics in several variables we will see, among other things, methods to efficiently compute the number of walks of large size and techniques to determine asymptotic behaviour (it turns out the probability that a walk of length $n$ stays in $\mathbb{N}^2$ goes to zero like $\frac{4}{\pi n}$). Further details on this example can be found in this post.

2. Complexity of Integer Sequences

One of the most famous examples of an integer sequence is the Virahanka-Fibonacci sequence $f_n$ defined by the recurrence $$ f_{n+2} = f_{n+1} + f_n, \qquad f_0=f_1=1.$$

How fast can we determine the term $f_N$ for large $N$? Using the recurrence directly, one must perform $N$ additions. Taking the growing size of the sequence terms into account, computing $f_N$ in this manner uses approximately $O(N^2\log N)$ operations on fixed-size integers. Can we do better?

Yes. For instance, it can be seen from the recurrence that for all $N\in\mathbb{N}$,

$$\begin{pmatrix} f_{N+1} \\ f_N \end{pmatrix} = \begin{pmatrix}1&1\\1&0\end{pmatrix}^N \begin{pmatrix} f_1 \\ f_0 \end{pmatrix}.$$

Using binary squaring to compute the $N$th matrix power, we can obtain $f_N$ using around $O(N\log^2N)$ operations on fixed-size integers. Can we do better? (Note: Diagonalizing the matrix above gives $f_N$ as an explicit sum of two powers of algebraic numbers. This representation is great for determining asymptotic behaviour, but terrible for actually computing terms.)

Again, yes (to a lesser degree). If $R$ is the quotient ring $R = \mathbb{Q}[x]/(x^2-x-1)$ then $f_N$ equals the the coefficient of $x$ when $x^{N+1}$ is represented by a linear polynomial in $R$ (we will see why in class). Thus, we can compute $f_N$ by exponentiating polynomials in quotient rings, which is more efficient than taking matrix powers. (This approach also has complexity around $O(N\log^2N)$ but the hidden constants are lower, especially for linear recurrences with higher shifts in index)

Time (in microseconds) needed in Sage to compute $f_N$ using binary powering of a matrix (red) and binary powering in a polynomial quotient ring (blue). The essentially linear complexity of these approaches can be directly observed. Naively computing terms with the recurrence would not terminate for $N$ of this size in any reasonable time. Readers can create their own timings using the interactive link above.

It is also of note that the (essentially) linear complexity comes from the growth of the integers involved. Using the quotient ring approach while computing modulo a fixed integer results in a logarithmic complexity! (For instance, it takes only 5 milliseconds to compute $F_{10^{100}}$ mod $13 = 5$.)

3. Limit Theorems for Combinatorial Objects

Instead of simply counting combinatorial objects by size, it is often illuminating to also keep track of different parameters. This is especially useful in the analysis of algorithms and computer algebra, as the number of polynomials of fixed degree whose coefficients come from a finite field (like the integers modulo a prime) form a finite set.

For instance, let $e_{n,k}$ denote the number of pairs of polynomials $(f,g)$ with coefficients in $\mathbb{Z}_p$ such that $f$ is monic, $n = \deg f > \deg g$, and the Euclidean gcd algorithm applied to $(f,g)$ takes $k$ steps to terminate. The Symbolic Method for generating functions shows how to write the bivariate sequence $e_{n,k}$ as the power series coefficients of a rational function (depending on the prime $p$): $$ \sum_{n,k \geq 0} e_{k,n} u^kz^n = \frac{1}{1-pz - p(p-1)uz}. $$ From here, analytic methods imply that for $n$ fixed and large the sequence $e_{k,n}$ approaches a normal distribution. In other words, as the degrees of the polynomials approach infinity the number of divisions performed by the Euclidean algorithm satsifies a central limit theorem.

A running count of the number of divisions performed when running the Euclidean algorithm on pairs of polynomials in $\mathbb{Z}_3[x]$ with the higher degree polynomial monic of degree 500, compared to the expected distribution from the limit curve. Readers can experiment with different primes and polynomial degrees using the interactive link above. This animation resets every 7 seconds. If it does not play in your browser click here.

Computer Algebra Software and Expectations

Students will be expected to do some basic coding on their assignments. You are free to use any computer algebra system or programming language you wish (SageMath, Maple, Mathematica, Julia, Python, etc.). Because it is free, open source, and even possible to run in your browser with no installation, I recommend using Sage.

A short tutorial for Sage can be run in the browser at this link (after a short load time of up to a minute) or downloaded at this link (it can also be viewed as a static HTML page here).

We will mainly use Sage for examples in class. Possibilities for running Sage yourself include

  1. Installing Sage on your computer from its website (can be surprisingly tricky)
  2. Signing up for a free account on the online browser-based CoCalc service
  3. Using Waterloo’s JupyerHub website (Requires Waterloo login)
  4. Using the cloud environment at this link (Note: this does not require any signup or download, but there is an initial load time of up to a minute. Your work does not carry over on page reload – download anything you want to save!)

In addition to the packages that come built-in to Sage, we will also make use of the ore_algebra package, available at this link.

  1. Walks in orthants (such as a two-dimensional quadrant) are important in queuing theory (among many other areas) as they model jobs/people entering and being processed out of queues and the number of jobs/people in a line can never be negative. ↩︎