Explore Papers
Read papers in beautiful, interactive HTML
Latest Papers
Hilbert's sixth problem: derivation of fluid equations via Boltzmann's kinetic theory
Authors: Yu Deng, Zaher Hani, Xiao Ma
In this paper, we rigorously derive the fundamental PDEs of fluid mechanics, such as the compressible Euler and incompressible Navier-Stokes-Fourier equations, starting from the hard sphere particle systems undergoing elastic collisions. This resolves Hilbert's sixth problem, as it pertains to the program of deriving the fluid equations from Newton's laws by way of Boltzmann's kinetic theory. The proof relies on the derivation of Boltzmann's equation on 2D and 3D tori, which is an extension of our previous work (arXiv:2408.07818).
A sharp version of Price's law for wave decay on asymptotically flat spacetimes
Authors: P. Hintz
We prove Price's law with an explicit leading order term for solutions $\phi(t,x)$ of the scalar wave equation on a class of stationary asymptotically flat $(3+1)$-dimensional spacetimes including subextremal Kerr black holes. Our precise asymptotics in the full forward causal cone imply in particular that $\phi(t,x)=c t^{-3}+\mathcal O(t^{-4+})$ for bounded $|x|$, where $c\in\mathbb C$ is an explicit constant. This decay also holds along the event horizon on Kerr spacetimes and thus renders a result by Luk-Sbierski on the linear scalar instability of the Cauchy horizon unconditional. We moreover prove inverse quadratic decay of the radiation field, with explicit leading order term. We establish analogous results for scattering by stationary potentials with inverse cubic spatial decay. On the Schwarzschild spacetime, we prove pointwise $t^{-2 l-3}$ decay for waves with angular frequency at least $l$, and $t^{-2 l-4}$ decay for waves which are in addition initially static. This definitively settles Price's law for linear scalar waves in full generality. The heart of the proof is the analysis of the resolvent at low energies. Rather than constructing its Schwartz kernel explicitly, we proceed more directly using the geometric microlocal approach to the limiting absorption principle pioneered by Melrose and recently extended to the zero energy limit by Vasy.
Mask R-CNN
Authors: Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick
We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code has been made available at: https://github.com/facebookresearch/Detectron
Anomalous diffusion via iterative quantitative homogenization: an overview of the main ideas
Authors: Scott Armstrong, V. Vicol
Anomalous diffusion is the fundamental ansatz of phenomenological theories of passive scalar turbulence, and has been confirmed numerically and experimentally to an extraordinary extent. The purpose of this survey is to discuss our recent result, in which we construct a class of incompressible vector fields that have many of the properties observed in a fully turbulent velocity field, and for which the associated scalar advection-diffusion equation generically displays anomalous diffusion. Our main contribution is to propose an analytical framework in which to study anomalous diffusion via a backward cascade of renormalized eddy viscosities.
The Two-Phase Membrane Problem -- an Intersection-Comparison Approach to the Regularity at Branch Points
Authors: Henrik Shahgholian, Georg S. Weiss
For the two-phase membrane problem $ \Delta u = {\lambda_+\over 2} \chi_{\{u>0\}} - {\lambda_-\over 2} \chi_{\{u<0\}} ,$ where $\lambda_+> 0$ and $\lambda_->0 ,$ we prove in two dimensions that the free boundary is in a neighborhood of each ``branch point'' the union of two $C^1$-graphs. We also obtain a stability result with respect to perturbations of the boundary data. Our analysis uses an intersection-comparison approach based on the Aleksandrov reflection. In higher dimensions we show that the free boundary has finite $(n-1)$-dimensional Hausdorff measure.
A counterexample to the periodic tiling conjecture
Authors: Rachel Greenfeld, Terence Tao
The periodic tiling conjecture asserts that any finite subset of a lattice $\mathbb{Z}^d$ which tiles that lattice by translations, in fact tiles periodically. In this work we disprove this conjecture for sufficiently large $d$, which also implies a disproof of the corresponding conjecture for Euclidean spaces $\mathbb{R}^d$. In fact, we also obtain a counterexample in a group of the form $\mathbb{Z}^2 \times G_0$ for some finite abelian $2$-group $G_0$. Our methods rely on encoding a "Sudoku puzzle" whose rows and other non-horizontal lines are constrained to lie in a certain class of "$2$-adically structured functions," in terms of certain functional equations that can be encoded in turn as a single tiling equation, and then demonstrating that solutions to this Sudoku puzzle exist, but are all non-periodic.
Almost all orbits of the Collatz map attain almost bounded values
Authors: Terence Tao
Define the \emph{Collatz map} $\mathrm{Col} : \mathbb{N}+1 \to \mathbb{N}+1$ on the positive integers $\mathbb{N}+1 = \{1,2,3,\dots\}$ by setting $\mathrm{Col}(N)$ equal to $3N+1$ when $N$ is odd and $N/2$ when $N$ is even, and let $\mathrm{Col}_{\min}(N) := \inf_{n \in \mathbb{N}} \mathrm{Col}^n(N)$ denote the minimal element of the Collatz orbit $N, \mathrm{Col}(N), \mathrm{Col}^2(N), \dots$. The infamous \emph{Collatz conjecture} asserts that $\mathrm{Col}_{\min}(N)=1$ for all $N \in \mathbb{N}+1$. Previously, it was shown by Korec that for any $\theta > \frac{\log 3}{\log 4} \approx 0.7924$, one has $\mathrm{Col}_{\min}(N) \leq N^\theta$ for almost all $N \in \mathbb{N}+1$ (in the sense of natural density). In this paper we show that for \emph{any} function $f : \mathbb{N}+1 \to \mathbb{R}$ with $\lim_{N \to \infty} f(N)=+\infty$, one has $\mathrm{Col}_{\min}(N) \leq f(N)$ for almost all $N \in \mathbb{N}+1$ (in the sense of logarithmic density). Our proof proceeds by establishing an approximate transport property for a certain first passage random variable associated with the Collatz iteration (or more precisely, the closely related Syracuse iteration), which in turn follows from estimation of the characteristic function of a certain skew random walk on a $3$-adic cyclic group at high frequencies. This estimation is achieved by studying how a certain two-dimensional renewal process interacts with a union of triangles associated to a given frequency.
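The definitions of $\mathrm{Col}$ and $\mathrm{Col}_{\min}$ above can be tried out numerically. The following is a minimal sketch, not taken from the paper: the function names `col` and `col_min` and the cutoff `max_steps` are illustrative assumptions, and because the orbit is truncated the result is only an upper bound on the true $\mathrm{Col}_{\min}(N)$ unless the orbit has reached 1.

```python
def col(n: int) -> int:
    """One step of the Collatz map on the positive integers."""
    if n < 1:
        raise ValueError("the Collatz map is defined on positive integers only")
    return 3 * n + 1 if n % 2 else n // 2


def col_min(n: int, max_steps: int = 10_000) -> int:
    """Smallest value among N, Col(N), ..., truncated after max_steps steps.

    max_steps is a hypothetical cutoff: if the orbit has not reached 1 by
    then, the returned value is only an upper bound on Col_min(N).
    """
    smallest = n
    for _ in range(max_steps):
        if n == 1:
            # From 1 the orbit cycles 1 -> 4 -> 2 -> 1, so 1 is the minimum.
            break
        n = col(n)
        smallest = min(smallest, n)
    return smallest


if __name__ == "__main__":
    for start in (1, 7, 27):
        print(start, col_min(start))
```

A finite computation of this kind only samples individual orbits; it says nothing about the density statements proved in the paper.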
Critical long-range percolation III: The upper critical dimension
Authors: Tom Hutchcroft
In long-range percolation on $\mathbb{Z}^d$, points $x$ and $y$ are connected by an edge with probability $1-\exp(-\beta\|x-y\|^{-d-\alpha})$, where $\alpha>0$ is fixed and $\beta \geq 0$ is a parameter. As $d$ and $\alpha$ vary, the model is conjectured to exhibit eight qualitatively different second-order critical behaviours, with a transition between mean-field and low-dimensional regimes when $d=\min\{6,3\alpha\}$, a transition between long- and short-range regimes at a crossover value $\alpha_c(d)$, and with various logarithmic corrections at the boundaries between these regimes. This is the third of three papers developing a rigorous theory of the model's critical behavior in five of these eight regimes, including all long-range (LR) and high-dimensional (HD) regimes. Here, we analyze the model at its upper critical dimension $d=3\alpha<6$. We prove the hydrodynamic condition holds, which allows us to apply our first paper's RG analysis to deduce that the model has the same superprocess scaling limits as in high dimension, after accounting for slowly varying corrections to scaling. We then compute the precise logarithmic corrections to scaling by analyzing the RG flow to second order. Our results yield in particular that for $d=3\alpha < 6$ the critical volume tail is \[ \mathbb{P}_{\beta_c}(|K|\geq n) \sim C \frac{(\log n)^{1/4}}{\sqrt{n}} \] as $n\to \infty$, while the critical two- and three-point functions are \[ \mathbb{P}_{\beta_c}(x\leftrightarrow y) \asymp \|x-y\|^{-d+\alpha} \; \text{ and } \; \mathbb{P}_{\beta_c}(x\leftrightarrow y \leftrightarrow z) \asymp \sqrt{\frac{\|x-y\|^{-d+\alpha}\|y-z\|^{-d+\alpha}\|z-x\|^{-d+\alpha}}{\log(1+\min\{\|x-y\|,\|y-z\|,\|z-x\|\})}}. \] These logarithmic corrections match those in hierarchical percolation but differ from those conjectured for nearest-neighbour percolation on $\mathbb{Z}^6$.
Critical long-range percolation II: Low effective dimension
Authors: Tom Hutchcroft
In long-range percolation on $\mathbb{Z}^d$, points $x$ and $y$ are connected by an edge with probability $1-\exp(-\beta\|x-y\|^{-d-\alpha})$, where $\alpha>0$ is fixed and $\beta \geq 0$ is a parameter. As $d$ and $\alpha$ vary, the model is conjectured to exhibit eight qualitatively different second-order critical behaviours, with a transition between mean-field and low-dimensional regimes when $d=\min\{6,3\alpha\}$, a transition between long- and short-range regimes at a crossover value $\alpha_c(d)$, and with various logarithmic corrections at the boundaries between these regimes. This is the second of three papers developing a rigorous theory of the model's critical behavior in five of these eight regimes, including all long-range (LR) and high-dimensional (HD) regimes. We focus on the long-range low-dimensional (LR-LD) regime $d/3<\alpha<\alpha_c(d)$, where the model is below its upper critical dimension. Since computing $\alpha_c(d)$ for $2<d<6$ appears to be beyond the scope of current techniques, we give an axiomatic definition of the LR regime which we prove holds for $\alpha <1$. Using this, we prove up-to-constants estimates for the critical and slightly subcritical two-point function in the LR regime and for the volume tail and $k$-point function in the LR-LD regime. We deduce that the critical exponents satisfy the identities \[ \eta = 2-\alpha, \qquad \gamma = (2-\eta)\nu, \qquad \text{ and } \qquad \Delta = \nu d_f \] in the LR regime (if $\gamma$, $\nu$, or $\Delta$ is well-defined) and that $\delta$ and $d_f$ follow the hyperscaling identities \[ \delta = \frac{d+\alpha}{d-\alpha} \qquad \text{ and } \qquad d_f = \frac{d+\alpha}{2} \] in the LR-LD regime. Our results are suggestive of conformal invariance in the LR-LD regime, with the critical $k$-point function matching an explicit M\"obius-covariant function up-to-constants.
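As a worked instance of the identities $\eta = 2-\alpha$, $\delta = \frac{d+\alpha}{d-\alpha}$ and $d_f = \frac{d+\alpha}{2}$ stated above, the sketch below evaluates them in exact arithmetic. The function names are illustrative assumptions, and the check `alpha < d` is only a guard that keeps the denominator positive; it is not the true upper limit $\alpha_c(d)$, which the abstract says is not known in general.

```python
from fractions import Fraction


def lr_ld_exponents(d, alpha):
    """Evaluate eta, delta and d_f from the identities quoted in the abstract.

    Assumption: alpha < d is used purely as a guard for d - alpha > 0;
    it stands in for the unknown crossover value alpha_c(d).
    """
    d = Fraction(d)
    alpha = Fraction(alpha)
    if not (d / 3 < alpha < d):
        raise ValueError("expected d/3 < alpha < d")
    return {
        "eta": 2 - alpha,
        "delta": (d + alpha) / (d - alpha),
        "d_f": (d + alpha) / 2,
    }


def delta_at_boundary(d):
    """Value of (d + alpha) / (d - alpha) at the boundary alpha = d/3."""
    alpha = Fraction(d) / 3
    return (d + alpha) / (d - alpha)


if __name__ == "__main__":
    print(lr_ld_exponents(1, Fraction(1, 2)))
    print(delta_at_boundary(1))
```

For $d=1$, $\alpha=1/2$ this gives $\eta = 3/2$, $\delta = 3$ and $d_f = 3/4$. At the boundary $\alpha = d/3$ the formula for $\delta$ evaluates to $2$ for every $d$, which agrees with the mean-field value of $\delta$.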
Critical long-range percolation I: High effective dimension
Authors: Tom Hutchcroft
In long-range percolation on $\mathbb{Z}^d$, points $x$ and $y$ are connected by an edge with probability $1-\exp(-\beta\|x-y\|^{-d-\alpha})$, where $\alpha>0$ is fixed and $\beta \geq 0$ is a parameter. As $d$ and $\alpha$ vary, the model is conjectured to exhibit eight qualitatively different second-order critical behaviours, with a transition between mean-field and low-dimensional regimes when $d=\min\{6,3\alpha\}$, a transition between long- and short-range regimes at a crossover value $\alpha_c(d)$, and with various logarithmic corrections at the boundaries between these regimes. This is the first of a series of three papers developing a rigorous theory of the model's critical behavior in five of these eight regimes, including all long-range (LR) and high-dimensional (HD) regimes. In this paper, we introduce our non-perturbative real-space renormalization group method and apply this method to analyze the HD regime $d>\min\{6,3\alpha\}$. In particular, we compute the tail of the cluster volume and establish the superprocess scaling limits of the model, which transition between super-Levy and super-Brownian behavior when $\alpha=2$. All our results hold unconditionally for $d> 3\alpha$, without any perturbative assumptions on the model; beyond this regime, when $d> 6$ and $\alpha \geq d/3$, they hold under the assumption that appropriate two-point function estimates hold as provided for spread-out models by the lace expansion. Our results on scaling limits also hold (with possible slowly-varying corrections to scaling) in the critical-dimensional regime with $d=3\alpha<6$ subject to a marginal-triviality condition we call the hydrodynamic condition; this condition is verified in the third paper in this series, in which we also compute the precise logarithmic corrections to mean-field scaling when $d=3\alpha<6$.
Stable minimal hypersurfaces in $\mathbf{R}^5$
Authors: Otis Chodosh, Chao Li, Paul Minter, Douglas Stryker
We show that a complete, two-sided, stable minimal hypersurface in $\mathbf{R}^5$ is flat.
The Erdos discrepancy problem
Authors: Terence Tao
We show that for any sequence $f: {\bf N} \to \{-1,+1\}$ taking values in $\{-1,+1\}$, the discrepancy $$ \sup_{n,d \in {\bf N}} \left|\sum_{j=1}^n f(jd)\right| $$ of $f$ is infinite. This answers a question of Erd\H{o}s. In fact the argument also applies to sequences $f$ taking values in the unit sphere of a real or complex Hilbert space. The argument uses three ingredients. The first is a Fourier-analytic reduction, obtained as part of the Polymath5 project on this problem, which reduces the problem to the case when $f$ is replaced by a (stochastic) completely multiplicative function ${\bf g}$. The second is a logarithmically averaged version of the Elliott conjecture, established recently by the author, which effectively reduces to the case when ${\bf g}$ usually pretends to be a modulated Dirichlet character. The final ingredient is (an extension of) a further argument obtained by the Polymath5 project which shows unbounded discrepancy in this case.
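The discrepancy defined above can be computed for a finite truncation of a sequence. The following is a minimal sketch under stated assumptions: `discrepancy` restricts the supremum to pairs with $nd \leq$ `limit`, and `ternary_sign` is an illustrative completely multiplicative example (the sign of the last nonzero base-3 digit), not a construction from the paper.

```python
def discrepancy(f, limit: int) -> int:
    """Max of |f(d) + f(2d) + ... + f(nd)| over all n, d with n*d <= limit."""
    best = 0
    for d in range(1, limit + 1):
        partial = 0
        for m in range(d, limit + 1, d):
            partial += f(m)
            best = max(best, abs(partial))
    return best


def ternary_sign(n: int) -> int:
    """Completely multiplicative +-1 function: strip factors of 3,
    then return +1 if the remainder mod 3 is 1 and -1 if it is 2."""
    while n % 3 == 0:
        n //= 3
    return 1 if n % 3 == 1 else -1


def alternating(n: int) -> int:
    """+1 on even n, -1 on odd n."""
    return 1 if n % 2 == 0 else -1


if __name__ == "__main__":
    for limit in (8, 13, 1000):
        print(limit, discrepancy(ternary_sign, limit), discrepancy(alternating, limit))
```

The alternating sequence has small partial sums for $d=1$ but grows linearly along $d=2$, while the truncated discrepancy of `ternary_sign` grows much more slowly. The theorem says that no $\pm 1$ sequence keeps this quantity bounded as the truncation is removed.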
Sendov's conjecture for sufficiently high degree polynomials
Authors: Terence Tao
Sendov's conjecture asserts that if a complex polynomial $f$ of degree $n \geq 2$ has all of its zeroes in the closed unit disk $\{ z: |z| \leq 1 \}$, then for each such zero $\lambda_0$ there is a zero of the derivative $f'$ in the closed unit disk $\{ z: |z-\lambda_0| \leq 1 \}$. This conjecture is known for $n < 9$, but only partial results are available for higher $n$. We show that there exists a constant $n_0$ such that Sendov's conjecture holds for $n \geq n_0$. For $\lambda_0$ away from the origin and the unit circle we can appeal to the prior work of D\'egot and Chalebgwa; for $\lambda_0$ near the unit circle we refine a previous argument of Miller (and also invoke results of Chijiwa when $\lambda_0$ is extremely close to the unit circle); and for $\lambda_0$ near the origin we introduce a new argument using compactness methods, balayage, and the argument principle.
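To make the statement concrete, the sketch below checks it numerically for a cubic $f(z) = (z-a)(z-b)(z-c)$, where $f'$ is a quadratic whose zeros follow from the quadratic formula. This is only an illustration of the statement in the lowest interesting degree; the function names and the floating-point tolerance `tol` are assumptions, and the code is unrelated to the proof techniques of the paper.

```python
import cmath


def critical_points_of_cubic(a: complex, b: complex, c: complex):
    """Zeros of f'(z) for f(z) = (z - a)(z - b)(z - c).

    Expanding gives f'(z) = 3 z^2 - 2 (a + b + c) z + (ab + bc + ca).
    """
    s1 = a + b + c
    s2 = a * b + b * c + c * a
    root = cmath.sqrt(4 * s1 * s1 - 12 * s2)
    return ((2 * s1 + root) / 6, (2 * s1 - root) / 6)


def sendov_holds_for_cubic(a: complex, b: complex, c: complex, tol: float = 1e-6) -> bool:
    """True if every zero of f has a zero of f' within distance 1 (up to tol).

    tol is a hypothetical rounding allowance; it matters for extremal
    cases such as z^3 - 1, where the distance is exactly 1.
    """
    zeros = (a, b, c)
    if any(abs(z) > 1 + tol for z in zeros):
        raise ValueError("all zeros must lie in the closed unit disk")
    crit = critical_points_of_cubic(a, b, c)
    return all(min(abs(w - z) for w in crit) <= 1 + tol for z in zeros)


if __name__ == "__main__":
    cube_roots = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]
    print(sendov_holds_for_cubic(*cube_roots))
    print(sendov_holds_for_cubic(1, -1, 0))
```

Since the conjecture is known for $n < 9$, the check should return `True` for every admissible cubic; the example $z^3 - 1$ is a case where the bound is attained, with the double critical point at the origin at distance exactly 1 from each zero.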
Engineering flexible machine learning systems by traversing functionally-invariant paths
Authors: Guruprasad Raghavan, Bahey Tharwat, Surya Narayanan Hari, Dhruvil Satani, Matt Thomson
Transformers have emerged as the state-of-the-art neural network architecture for natural language processing and computer vision. In the foundation model paradigm, large transformer models (BERT, GPT3/4, Bloom, ViT) are pre-trained on self-supervised tasks such as word or image masking, and then adapted through fine-tuning for downstream user applications including instruction following and question answering. While many approaches have been developed for model fine-tuning, including low-rank weight update strategies (e.g., LoRA), the underlying mathematical principles that enable network adaptation without knowledge loss remain poorly understood. Here, we introduce a differential geometry framework, functionally invariant paths (FIP), that provides flexible and continuous adaptation of neural networks for a range of machine learning goals and network sparsification objectives. We conceptualize the weight space of a neural network as a curved Riemannian manifold equipped with a metric tensor whose spectrum defines low-rank subspaces in weight space that accommodate network adaptation without loss of prior knowledge. We formalize adaptation as movement along a geodesic path in weight space while searching for networks that accommodate secondary objectives. With modest computational resources, the FIP algorithm achieves performance comparable to the state of the art on continual learning and sparsification tasks for language models (BERT), vision transformers (ViT, DeIT), and CNNs. Broadly, we conceptualize a neural network as a mathematical object that can be iteratively transformed into distinct configurations by the path-sampling algorithm to define a sub-manifold of weight space that can be harnessed to achieve user goals.
A Fully First-Order Method for Stochastic Bilevel Optimization
Authors: Jeongyeol Kwon, Dohyun Kwon, Stephen Wright, Robert Nowak
We consider stochastic unconstrained bilevel optimization problems when only the first-order gradient oracles are available. While numerous optimization methods have been proposed for tackling bilevel problems, existing methods either tend to require possibly expensive calculations regarding Hessians of lower-level objectives, or lack rigorous finite-time performance guarantees. In this work, we propose a Fully First-order Stochastic Approximation (F2SA) method, and study its non-asymptotic convergence properties. Specifically, we show that F2SA converges to an $\epsilon$-stationary solution of the bilevel problem after $\epsilon^{-7/2}, \epsilon^{-5/2}$, and $\epsilon^{-3/2}$ iterations (each iteration using $O(1)$ samples) when stochastic noises are in both level objectives, only in the upper-level objective, and not present (deterministic settings), respectively. We further show that if we employ momentum-assisted gradient estimators, the iteration complexities can be improved to $\epsilon^{-5/2}, \epsilon^{-4/2}$, and $\epsilon^{-3/2}$, respectively. We demonstrate even superior practical performance of the proposed method over existing second-order based approaches on MNIST data-hypercleaning experiments.
Hypocoercivity and exponential time decay for the linear inhomogeneous relaxation Boltzmann equation
Authors: Frederic Herau
We consider an inhomogeneous linear Boltzmann equation with an external confining potential. The collision operator is a simple relaxation toward a local Maxwellian, and therefore has no diffusion. We prove exponential time decay toward the global Maxwellian, with an explicit rate of decay. The approach is based on hypoelliptic methods, transposed here to obtain spectral information. These methods were inspired by earlier work on the Fokker-Planck equation, and the main feature of this work is that they remain relevant although the equation itself has no regularizing properties.
Counterexamples to conjectures by Gross, Mansour and Tucker on partial-dual genus polynomials of ribbon graphs
Authors: Qi Yan, Xian'an Jin
Gross, Mansour and Tucker introduced the partial-dual orientable genus polynomial and the partial-dual Euler genus polynomial. They computed these two partial-dual genus polynomials for four families of ribbon graphs, posed some research problems and made some conjectures. In this paper, we introduce the notion of signed sequences of bouquets and obtain, in terms of signed sequences, the partial-dual Euler genus polynomials for all ribbon graphs with fewer than 4 edges and the partial-dual orientable genus polynomials for all orientable ribbon graphs with fewer than 5 edges. We check all the conjectures and find a counterexample to Conjecture 3.1 in their paper: There is no orientable ribbon graph having a non-constant partial-dual genus polynomial with only one non-zero coefficient. Motivated by this counterexample, we further find an infinite family of counterexamples to the conjecture. Moreover, we find a counterexample to Conjecture 5.3 in their paper: The partial-dual Euler-genus polynomial for any non-orientable ribbon graph is interpolating.
Llama 2: Open Foundation and Fine-Tuned Chat Models
Authors: Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.
Smooth invariant foliations and Koopman eigenfunctions about stable equilibria of semiflows
Authors: Gergely Buza
We consider a $C^r$ semiflow $\{ \varphi_t \}_{t \geq 0}$ on a Banach space $X$ admitting a stable fixed point $x$. We show, along the lines of the parameterization method (Cabr\'e et al., 2003), the existence of a $C^r$ invariant foliation tangent to $X_1$ at $x$, for an arbitrary $D \varphi_t(x)$-invariant subspace $X_1 \subset X$ satisfying some additional spectral conditions. Uniqueness ensues in a subclass of sufficiently smooth invariant foliations tangent to $X_1$ at $x$. We then draw relations to Koopman theory, and thereby establish the existence and uniqueness, in some appropriate sense, of $C^r$ Koopman eigenfunctions. We demonstrate that these results apply to the case of the Navier-Stokes system, the archetypal example considered by the modern upheaval of applied 'Koopmanism'.
Existence of KPP fronts in spatially-temporally periodic advection and variational principle for propagation speeds
Authors: James Nolen, Matthew Rudd, Jack Xin
We prove the existence of Kolmogorov-Petrovsky-Piskunov (KPP) type traveling fronts in space-time periodic and mean zero incompressible advection, and establish a variational (minimization) formula for the minimal speeds. We approach the existence by considering the limit of a sequence of front solutions to a regularized traveling front equation where the nonlinearity is of combustion type with ignition cut-off. The limiting front equation is degenerate parabolic and does not permit strong solutions; however, the necessary compactness follows from monotonicity of fronts and degenerate regularity. We apply a dynamic argument to justify that the constructed KPP traveling fronts propagate at minimal speeds, and derive the speed variational formula. The dynamic method avoids the degeneracy in traveling front equations, and utilizes the parabolic maximum principle of the governing reaction-diffusion-advection equation. The dynamic method does not rely on the existence of traveling fronts.