0 - Introduction

We aim to illustrate a wide variety of mathematical and computational tools for the large-scale behaviour of discrete objects. The intended audience includes both undergraduate and graduate students in mathematics and computer science, as well as more experienced researchers. We will investigate questions of computability (what can be computed) and complexity (how fast can it be computed).

This post surveys some of the issues that will be touched on. Our overarching goal is to take a sequence which

  • counts the number of objects in a combinatorial class (e.g., $a_n$ is the number of binary trees on $n$ nodes), or

  • captures the probability that an event occurs (e.g., $a_n$ is the probability that a permutation of $1,\dots,n$ has no fixed point), or

  • tracks the runtime of an algorithm (e.g., $a_n$ is the average number of comparisons performed by quicksort on permutations of $1,\dots,n$), or

  • encodes a combinatorial class with parameters (e.g., $a_{k,n}$ is the number of binary trees on $n$ nodes with $k$ leaves)

and say something interesting about the sequence. There are many ways to say something interesting about a sequence, including

1. Explicit Formula Enumeration. Many combinatorial sequences admit simple closed forms, meaning the terms of the sequence (say $a_n$) can be written as a simple expression in its indices ($n$), and one can aspire to find the simplest expression possible.

The start of Cayley’s famous 1889 article [C89] on the number of labeled trees.

The precise meaning of a simple (or even non-trivial) expression can be surprisingly hard to pin down1. We take a basic know-it-as-we-see-it approach, although nice expressions for us will usually involve nothing beyond the usual arithmetic operations, factorials, binomial coefficients, and indefinite summation and product.

2. Efficient Calculation. Instead of finding an explicit expression for the general term of a sequence, one might be happy calculating a specific term of high index (say $a_{100}$, or $a_{1000}$, or $a_{10^{10^{10}}}$) or all terms up to some index. During this process, information about a sequence is learned both by studying terms of large index and by the act of constructing an efficient algorithm for generating terms.

Perhaps the most famous example of an integer sequence is the Virahanka-Fibonacci sequence, defined by $v_0=v_1=1$ and
$$v_{n+2} = v_{n+1} + v_n \qquad (n \geq 0)$$

A long list of combinatorial objects enumerated by $v_n$ can be found on the OEIS (on-line encyclopedia of integer sequences). How fast can $v_N$ be calculated, as a function of $N?$ What is the largest term that can be determined on a modern computer in one minute? What about other sequences satisfying linear recurrences?

We will soon see that efficient ways of calculating terms of large index in such sequences, which is an order of magnitude faster than simply using the recurrence to generate terms, rely on well-known algebraic structures and operations (such as matrix multiplication and multiplication modulo a polynomial ideal).

3. Asymptotics. Of course, nice “explicit” formulas do not exist for all sequences of combinatorial interest. Even in cases where an explicit formula can be determined, such formulas can be cumbersome and hard to process (think of sequences with several nested sums, for instance). Because we care about large-scale behaviour, much of our discussions will address techniques to approximate a sequence $a_n$ as $n$ gets large. Given sequences $a_n$ and $b_n$, we write $a_n \sim b_n$ if the limit $a_n/b_n \rightarrow 1$ as $n\rightarrow\infty$. In this case, we say $a_n$ and $b_n$ have the same dominant asymptotic behaviour.

Consider the asymptotic approximations \begin{align} n! &\sim (n/e)^n \sqrt{2\pi n} \\[+3mm] p_n &\sim \frac{\exp\left(\pi\sqrt{2n/3}\right)}{4\sqrt{3} \; n} \\[+3mm] g_n &\sim 2^{\binom{2n}{n}}/n! \end{align} where $p_n$ denotes the number of integer partitions of $n$ and $g_n$ denotes the number of simple graphs on $n$ vertices.

Although $n!$ has a simple explicit characterization, this approximation (named after Stirling) allows one to get a better handle on how the factorial behaves for large $n$. More interestingly, $p_n$ has no simple explicit formula, but this asymptotic behaviour was famously determined by Hardy and Ramanujan [HR18] using analytic techniques (a result featured in a recent Hollywood movie). Furthermore, it has been open at least 1982 whether $g_n$ can be determined in polynomial time with respect to $n$ (see Wilf [W82]). The asymptotic behaviour of $g_n$ follows from probabilistic work of Erdős and Rényi [ER63].

We return to these examples in later posts.

Excerpt from Stirling’s 1730 text [S30]. English translation: The problem about finding the middle coefficient in a very large power of the binomial had been solved by De Moivre some years before I considered it: And it is probable that to this very day I would not have thought about it, unless that most esteemed man, Mr Alex. Cuming, had not stated that he very much doubted that it could be solved by Newton’s Method of Differences. See [T03] for a full English translation and additional historical context on Stirling’s work.

4. Limit Theorems. Perhaps the most striking results in this area are limit theorems for combinatorial objects with parameters. Most commonly, we examine the behaviour of terms in a multivariate sequence when one index is fixed and large. In light of the central limit theorem, it might not be surprising that many2 parameters in combinatorial classes exhibit Gaussian/normal behaviour as the size of the objects goes to infinity. We will explore several Gaussian limit laws in later posts, including the distribution of leaves in trees and the number of divisions performed by the Euclidean algorithm on pairs of polynomials with coefficients over a finite field.

This animation shows the cumulative results of counting the divisions performed by the Euclidean algorithm on 2001 random polynomial pairs $(f,g)$ over the finite field of order three with $f$ monic and $500 = \deg f > \deg g$. We see that the number of divisions approximates a limiting normal curve (which it approaches as the degree of $f$ increases). The code to produce this animation is below.

Data Structures for Combinatorial Sequences

In order to properly ask computational and complexity questions in combinatorics, we need appropriate data structures for the enumeration of combinatorial objects.

Given a univariate sequence $(a_n)=a_0,a_1,\dots$, the generating function of $(a_n)$ is the series $$ A(z) = \sum_{n=0}^{\infty} a_nz^n $$

in an indeterminate $z$. Given a multivariate sequence $(a_{\mathbf{i}}) = (a_{i_1,\dots,i_d})$, the generating function of $(a_{\mathbf{i}})$ is the multivariate series $$ A(\mathbf{z}) = A(z_1,\dots,z_d) = \sum_{i_1,\dots,i_d \geq 0} a_{i_1,\dots,i_d}z_1^{i_1}\cdots z_d^{i_d} $$ in the indeterminates $z_1,\dots,z_d$.

Although, in general, the generating function is a formal object (in a manner to be explained in a future post) it often coverges for $z$ in a disk around zero (in the complex plane) and thus represents the Taylor series of a complex-valued function at the origin. Many types of functions that arise frequently in combinatorics are naturally encoded by different algebraic and differential objects. For instance,

  • A polynomial or rational function with integer coefficients can be written explicitly. If $a_n$ satisfies a linear recurrence with constant coefficients (such as the Virahanka-Fibonacci numbers) then its generating function is rational.

  • An algebraic function is a function $f(z)$ which can be encoded by a polynomial $P(z,y) \in \mathbb{Z}[z,y]$ such that $P(z,f(z))=0$, and a finite number of terms of $f$.

  • A D-finite function is a differentiable function $f(z)$ which can be encoded by an annihilating linear differential equation $$ c_r(z)f^{(r)}(z) + \cdots + c_1(z)f’(z) + c_0(z)f(z)=0$$ with polynomial coefficients $c_j(z)$ and a finite number of terms of $f$. If $a_n$ satisfies a linear recurrence with polynomial coefficients (such as $a_{n+1}-na_n=0$) then its generating function is D-finite.

  • A D-algebraic function is a differentiable function $f(z)$ which can be encoded by a polynomial differential equation with polynomial coefficients (so that terms like $f’(z)f(z)$ are now allowed).

There are very interesting connections between various models of computation (finite automata, pushdown automata, Turing machines), and these classes of generating functions.

Similar encodings can be given for multivariate generating functions. In fact, multivariate functions often provide nice encodings for univariate sequences.

The (main) diagonal of a multivariate series $$ A(\mathbf{z}) = \sum_{i_1,\dots,i_d \geq 0} a_{i_1,\dots,i_d}z_1^{i_1}\cdots z_d^{i_d} $$ is the univariate series $(\Delta A)(z) = \sum_{n \geq 0} a_{n,\dots,n}z^n$. More generally, for $\mathbf{r} \in \mathbb{N}^d$, the $\mathbf{r}$-diagonal of $A$ is $(\Delta_{\mathbf{r}} A)(z) = \sum_{n \geq 0} a_{nr_1,\dots,nr_d}z^n$ (so the main diagonal is the $\mathbf{1}$-diagonal).

The $\mathbf{r}$-diagonals of multivariate rational functions form an important class of generating functions. In fact, all classes listed here form an increasing nest of complexity. This allows us to give a measure of complexity to a sequence by determining the smallest class its generating function is contained in.

The increasingly complicated classes of generating functions.

Given a high-level description of a combinatorial class, we will try to determine an encoding for its generating function (one of the encodings here, or something more complicated) and then manipulate this encoding to find closed formulas, efficiently determine terms of the sequence, find asymptotics, calculate limit theorems, etc. When this is possible, we try to determine how efficiently such tasks can be performed.

Indeed, although Taylor series coefficients of a rational function can be determined extremely efficiently, and both asymptotic and closed-form expressions for the coefficients can be found using partial fraction decomposition, it is undecidable to take a general D-algebraic function which has a convergent Taylor series at the origin and, from an annihilating polynomial differential equation, determine even whether its coefficient sequence exponentially grows or decays.

Illustrating tools for such analyses forms a large part of my work. These tools will come from every area of mathematics and computer science, including real and complex analysis, commutative algebra, algebraic and differential geometry, topology, and more. Not only does using such well-developed mathematical theories give powerful instruments for combinatorial problems, but this process also helps those trying to learn the underlying theories by giving motivation and putting the results in context.

Excerpt from Cauchy’s 1841 text [C41], showing how the coefficients of a power series expansion can be represented by a parametrized integral (note the typo). The Cauchy integral formula is crucial to modern methods of asymptotics.

Our second post illustrates these techniques on a concrete example from lattice path enumeration.

References for the Blog

These posts aim to give short presentations and illustrations of a wide selection of topics, often leaving full details to various references. The main reference is the textbook

Further details on many of these topics can also be found in

All of the works listed here have free manuscripts available at the given links. Additional references are given at the end of each post.

Computer Algebra Software

Computation forms a large part of our approach. We mainly use the open-source Sage computer algebra system, which is available for download on its website, and include the code used to generate each post. Sage can also be used through the online browser-based CoCalc service. Quick overviews can be found at this link and in the Sage Tutorial, and a full introduction to Sage can be found in the textbook Computational Mathematics with SageMath by P. Zimmermann and 15 coauthors.

Occasionally, we will use software packages for other computer algebra systems, including Maple, MAGMA, and Mathematica. These computer algebra systems are not open source, but many educational institutions provide free access to some or all of them.

Additional References for this Post

[C41] A.-L. Cauchy, Exercices d’Analyse et de Physique Mathématique 2, Paris, 1841.

[C89] A. Cayley. A theorem on trees. Quart. J. Pure Appl. Math. Vol 23, 376–378, 1889.

[ER63] P. Erdős and A. Rényi Asymmetric graphs. Acta Math. Acad. Sci. Hungar. 14, 295–315, 1963.

[HR18] G. H. Hardy and S. Ramanujan, Asymptotic Formulae in Combinatory Analysis. Proc. London Math. Soc. (2) Vol 17, 75–115, 1918.

[P18] I. Pak. Complexity problems in enumerative combinatorics. In Proc. Int. Cong. of Math. (3), Rio de Janeiro, 3139–3166, 2018.

[S30] J. Stirling. Methodus Differentialis: sive Tractatus de Summatione et Interpolatione Serierum Infinitarium. London, 1730.

[T03] I. Tweddle. James Stirling’s Methodus differentialis. Springer-Verlag London, 2003.

[W82] H. S. Wilf. What is an answer? Amer. Math. Monthly, Vol 89(5), 289–292, 1982.

Sage Code for this Post

The following is used to generate the Euclidean algorithm animation (using a bivariate generating function to be derived in a later posts).

# Define variables, size of coefficient field, number of plots
var('u z t n')
p = 2
N = 40

# Define the bivariate generating function
F = 1/(1-p*z-u*z*p*(p-1))
ser = F.series(z,N+1)

# Define the limiting density (for polynomials in GF(p))
density = p/(sqrt(2*pi*n*(p-1))) * exp(-n*(t/n-1+1/p)^2*p^2/(2*(p-1)))
dp = "\\frac{\\sqrt{2}}{\\sqrt{\\pi \\, n}} e^{-(2/n)(t-n/2)^2}"

# Iterate over maximum degree (k) of polynomial pair
an = [Graphics()]
for k in range(1,N+1):
    # Get distribution of divisions for pairs with max = k
    LST = ser.coefficient(z,k).coefficients()
    sm = sum([pt[0] for pt in LST])
    LST2 = [[pt[1],pt[0]/sm] for pt in LST]

    # Add the number of divisions to each 
    pl = Graphics()
    for pt in LST2:
        pl += point(pt, size=20)
    # Add text to the graph and set plotting range
    pl += text("Max degree $= %s$"  %(k),(0.035,0.97), axis_coords=True, fontsize=14, horizontal_alignment="left")
    pl += text("$%s$"  %(dp),(1,0.97), axis_coords=True, fontsize=18, horizontal_alignment="right", color="red")
    pl += plot(density.subs(n=k),t,(0,k),color="red")
    an += [pl]

# Animate sequence of plots
ani = animate(an[1:])

# Save animation as a gif (requires ImageMagick)

  1. Interesting discussions on what it means for a combinatorial formula to be simple or useful can be found in [W82] and [P18]. ↩︎

  2. For instance, Tables 1 and 2 of Hwang’s introduction to a chapter of the (in progress) collected works of Philippe Flajolet list 26 Gaussian limit laws proven by Flajolet for distinct combinatorial objects, including patterns in words, cost analyses of various sorting algorithms, parameters in different kinds of trees, polynomials with restricted coefficients, and balls in urns, among other examples. ↩︎