A Stirling-type formula for the distribution of the length of longest increasing subsequences

Folkmar Bornemann

TL;DR

This work advances the numerical study of the distribution of the length $L_n$ of the longest increasing subsequence in random permutations by introducing a Stirling-type formula based on Hayman’s $H$-admissibility. By proving $H$-admissibility of the generating functions $f_l(z)$ and deriving precise asymptotics via a multidimensional Laplace approach, the author obtains a uniform, multiplicative approximation for ${\mathbb P}(L_n\le l)$ that remains accurate for extremely large $n$, and connects this to hard-edge random matrix theory to enable efficient computation. The approach yields explicit finite-size corrections to the random-matrix limit, including conjectured first and second correction terms $F_{2,1}$ and $F_{2,2}$, and provides refined expansions for $\mathbb{E}(L_n)$ and ${\rm Var}(L_n)$ with accurate coefficient values. Numerically, the method is far faster than Monte Carlo simulation while remaining highly accurate uniformly in $l$, thereby bridging exact combinatorial tables (up to $n=1000$) with the universal Tracy–Widom limit and enabling precise finite-size analyses. This framework offers a tool for exploring finite-size effects in LIS distributions and related combinatorial-probabilistic models.

Abstract

The discrete distribution of the length of longest increasing subsequences in random permutations of $n$ integers is deeply related to random matrix theory. In a seminal work, Baik, Deift and Johansson provided an asymptotics in terms of the distribution of the scaled largest level of the large matrix limit of GUE. As a numerical approximation, however, this asymptotics is inaccurate for small $n$ and has a slow convergence rate, conjectured to be just of order $n^{-1/3}$. Here, we suggest a different type of approximation, based on Hayman's generalization of Stirling's formula. Such a formula gives already a couple of correct digits of the length distribution for $n$ as small as $20$ but allows numerical evaluations, with a uniform error of apparent order $n^{-2/3}$, for $n$ as large as $10^{12}$; thus closing the gap between a table of exact values (compiled for up to $n=1000$) and the random matrix limit. Being much more efficient and accurate than Monte-Carlo simulations, the Stirling-type formula allows for a precise numerical understanding of the first few finite size correction terms to the random matrix limit. From this we derive expansions of the expected value and variance of the length, exhibiting several more terms than previously put forward.
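
To make the object of study concrete, the following minimal sketch computes the exact distribution of $L_n$ for very small $n$ by brute-force enumeration. It is only an illustration and not the method of the paper: the function names `lis_length` and `lis_cdf` are hypothetical, patience sorting is swapped in as a standard way to compute the length of a longest increasing subsequence, and enumerating all $n!$ permutations is feasible only for roughly $n \le 9$.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import permutations
from math import factorial


def lis_length(seq):
    """Length of a longest strictly increasing subsequence (patience sorting)."""
    # tails[k] holds the smallest possible last element of an increasing
    # subsequence of length k + 1 found so far.
    tails = []
    for x in seq:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)   # x extends the longest subsequence seen so far
        else:
            tails[k] = x      # x gives a smaller tail for length k + 1
    return len(tails)


def lis_cdf(n):
    """Exact values P(L_n <= l) for l = 0..n, enumerating all n! permutations."""
    counts = [0] * (n + 1)
    for perm in permutations(range(n)):
        counts[lis_length(perm)] += 1
    total = factorial(n)
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(Fraction(running, total))
    return cdf


if __name__ == "__main__":
    for l, p in enumerate(lis_cdf(4)):
        print(l, p)
```

For $n=4$ this gives ${\mathbb P}(L_4 \le 2) = 14/24$, the $14$ permutations being exactly those that avoid an increasing subsequence of length $3$. The factorial cost of such an enumeration is the reason why exact tables and approximation formulas are needed for larger $n$.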

Paper Structure (16 sections, 10 theorems, 134 equations, 6 figures, 3 tables)


Key Result

Theorem 2.1

Let $f(z) = \sum_{n=0}^\infty a_n z^n$ and $g(z)$ be entire functions and let $p(z)$ denote a polynomial with real coefficients.

  I. If $f$ is $H$-admissible, then: …
  II. If $f$ and $g$ are $H$-admissible, then: …
  III. If $f$ has genus zero …

(By definition, an entire function $f$ has genus zero if it is a polynomial or if it can be represented as a convergent infinite product of the form $f(z) = c z^m \prod$ …)

Figures (6)

  • Figure 1: Discrete distribution of $L_n$ for $n=10^5$ near its mode vs. the random matrix limit given by the leading order terms in \ref{eq:BDJ99} and \ref{eq:PDFexpansion} (solid red line); here and in the figures below, discrete distributions are shown as blue bars centered at the integers. Left: CDF ${\mathbb P}( L_n \leqslant l)$; right: PDF ${\mathbb P}( L_n = l)$. The discrete distributions were computed using the Stirling-type formula \ref{eq:stirling}, with additive errors estimated to be smaller than $10^{-5}$, cf. Fig. \ref{fig:error}, which is well below plotting accuracy.
  • Figure 2: Display of the notable inaccuracy of the random matrix limit \ref{eq:BDJ99} for $n=1000$ (see the contrast with Fig. \ref{fig:pdf100000} for $n=10^5$). The discrete distribution of $L_n$ is shown near its mode vs. the random matrix limit given by the leading order terms in \ref{eq:CDFexpansion} and \ref{eq:PDFexpansion} (solid red line). Left: CDF ${\mathbb P}( L_n \leqslant l)$; right: PDF ${\mathbb P}( L_n = l)$. The exact values of the distribution of $L_n$ and their approximation by the Stirling-type formula \ref{eq:stirling} differ just by additive errors of the order $10^{-4}$ (see Fig. \ref{fig:error}), which is well below plotting accuracy.
  • Figure 3: Maximum absolute (i.e., additive) errors of various approximations to (left panel) the CDF ${\mathbb P}(L_n\leqslant l)$ and (right panel) the PDF ${\mathbb P}(L_n= l)$ in a double logarithmic scaling, based on tabulated exact values up to $n=1000$, cf. Sect. \ref{subsect:exact}; solid lines are fits of the form $c_1 n^{-\alpha_1} + c_2 n^{-\alpha_2} + c_3 n^{-\alpha_3}$ (CDF) and $n^{-1/6}\cdot(c_1 n^{-\alpha_1} + c_2 n^{-\alpha_2} + c_3 n^{-\alpha_3})$ (PDF) to the points in display. Red $+$: error of leading order terms (random matrix limit) in \ref{eq:CDFexpansion}, \ref{eq:PDFexpansion}; $\alpha = (\tfrac{1}{3},\tfrac{2}{3},1)$. Green $\circ$: error of expansions \ref{eq:CDFexpansion}, \ref{eq:PDFexpansion} truncated after the first finite size correction term; $\alpha=(\tfrac{2}{3}, 1, \tfrac{4}{3})$, where $F_{2,1}$ has been approximated as in Fig. \ref{fig:1st}. Blue $\bullet$: error of the Stirling-type formula \ref{eq:stirling}; $\alpha = (\tfrac{2}{3}, 1, \tfrac{4}{3})$.
  • Figure 4: Rescaled differences between the distributions of $L_n$ and their expansions truncated after the leading order term (i.e., the random matrix limit); see \ref{eq:CDFexpansion} for the CDF resp. \ref{eq:PDFexpansion} for the PDF. Data points (to avoid clutter just every $5^\text{th}$ is displayed) have been calculated using the Stirling-type formula \ref{eq:stirling} for $n=10^6$ (red $+$), $n=10^8$ (green $\circ$), $n=10^{10}$ (blue $\bullet$). Left: CDF errors rescaled by $n^{1/3}$, horizontal axis is $t=(l-2\sqrt{n})/n^{1/6}$, cf. arXiv:2205.05257 for a similar figure with data from Monte Carlo simulations for $n=2\cdot 10^4$ and $n=10^5$. The solid line is a polynomial $\tilde{F}_{2,1}(t)$ of degree $64$ fitted to the $836$ data points for $n=10^{10}$ with $-8\leqslant t \leqslant 10$; it approximates $F_{2,1}(t)$ in that interval. Right: PDF errors rescaled by $n^{1/2}$, horizontal axis is $t=(l-\frac{1}{2}-2\sqrt{n})/n^{1/6}$. The solid line displays the function $\tilde{F}_{2,1}'(t)+F_2'''(t)/24$ as an approximation of $F_{2,1}'(t)+F_2'''(t)/24$, with the polynomial $\tilde{F}_{2,1}(t)$ taken from the left panel. The dotted line shows the term $\tilde{F}_{2,1}'(t)$ only.
  • Figure 5: Left: the auxiliary function $a_5(r)$ (blue solid line) associated with the generating function $f_5(z)$, together with the asymptotics $a_5(r)= r + O(r^6)$ as $r\to 0$ (green dotted line) and $a_5(r)= 5 r^{1/2} + O(1)$ as $r\to \infty$ (red dashed line). Right: Illustration of the approximation \ref{eq:CLT} of the Boltzmann probabilities (blue bars) associated with the generating function $f_5(z)$ for intensity $r_{5,15}\approx 18.23$, cf. the notation in \ref{eq:stirling}. The normal distribution (red solid line) has mean $a_5(r_{5,15}) = 15$ (cf. the left panel) and variance $b_5(r_{5,15})\approx 10.80$.
  • ...and 1 more figure
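
The scaling used on the horizontal axes of Figure 4 can be sketched in a few lines. The variable $t=(l-2\sqrt{n})/n^{1/6}$ for the CDF and its shifted variant $t=(l-\frac{1}{2}-2\sqrt{n})/n^{1/6}$ for the PDF are taken from the caption; the helper names `scaled_t` and `lengths_in_window` are hypothetical and serve only this illustration.

```python
import math


def scaled_t(l, n, pdf=False):
    """Scaled variable t from the Figure 4 caption.

    CDF panel: t = (l - 2*sqrt(n)) / n**(1/6)
    PDF panel: t = (l - 1/2 - 2*sqrt(n)) / n**(1/6)
    """
    shift = 0.5 if pdf else 0.0
    return (l - shift - 2.0 * math.sqrt(n)) / n ** (1.0 / 6.0)


def lengths_in_window(n, t_min, t_max):
    """Integers l whose CDF-scaled variable t lies in [t_min, t_max]."""
    center = 2.0 * math.sqrt(n)
    scale = n ** (1.0 / 6.0)
    lo = math.ceil(center + t_min * scale)
    hi = math.floor(center + t_max * scale)
    return range(lo, hi + 1)


if __name__ == "__main__":
    # Number of integer lengths l with -8 <= t <= 10 for n = 10^10.
    print(len(lengths_in_window(10**10, -8, 10)))  # prints 836
```

The count of $836$ integers agrees with the number of data points quoted in the caption for $n=10^{10}$ and $-8\leqslant t \leqslant 10$.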

Theorems & Definitions (30)

  • Remark: added in proof
  • Theorem 2.1: Hayman 1956
  • Remark 2.1
  • Lemma 2.1
  • proof
  • Remark 2.2
  • Theorem 2.2
  • proof
  • Remark 2.3
  • Lemma 2.2
  • ...and 20 more