Computational complexity, data structures, algorithms, and game theory
2602.17577 Omniprediction is a learning problem that requires suboptimality bounds for each of a family of losses $\mathcal{L}$ against a family of comparator predictors $\mathcal{C}$. We initiate the study of omniprediction in a multiclass setting, where the comparator family $\mathcal{C}$ may be infinite. Our main result is an extension of the recent binary omniprediction algorithm of [OKK25] to the multiclass setting, with sample complexity (in statistical settings) or regret horizon (in online settings) $\approx \varepsilon^{-(k+1)}$, for $\varepsilon$-omniprediction in a $k$-class prediction problem. En route to proving this result, we design a framework of potential broader interest for solving Blackwell approachability problems where multiple sets must simultaneously be approached via coupled actions.
We present an algorithm turning any term of a linear quantum $λ$-calculus into a quantum circuit. The essential ingredient behind the proposed algorithm is Girard's geometry of interaction, which, unlike in its well-known uses in the literature, is here leveraged to perform as much of the classical computation as possible, while producing a circuit that, when evaluated, performs all the quantum operations in the underlying $λ$-term. We identify higher-order control flow as the primary obstacle towards efficient solutions to the problem at hand. Notably, geometry of interaction proves sufficiently flexible to enable efficient compilation in many cases, while still supporting a total compilation procedure. Finally, we characterize through a type system those $λ$-terms for which compilation can be performed efficiently.
2602.17480 In recent years, Homotopy Type Theory (HoTT) has had great success both as a foundation of mathematics and as an internal language to reason about $\infty$-groupoids (a.k.a. spaces). However, in many areas of mathematics and computer science, it is often categories, not groupoids, that are the more important structures to consider. For this reason, multiple directed type theories have been proposed, i.e., theories whose semantics are based on categories. In this paper, we present a new such type theory, Twisted Type Theory (TTT). It features a novel ``twisting'' operation on types: given a type that depends both contravariantly and covariantly on some variables, its twist is a new type that depends only covariantly on the same variables. To provide the semantics of this operation, we introduce the notion of dependent 2-sided fibrations (D2SFibs), which generalize Street's notion of 2-sided fibrations. We develop the basic theory of D2SFibs, as well as characterize them through a straightening-unstraightening theorem. With these results in hand, we introduce a new elimination rule for Hom-types. We argue that our syntax and semantics satisfy key features that allow reasoning in a HoTT-like style, which allows us to mimic the proof techniques of that setting. We end the paper by exemplifying this, and use TTT to reason about categories, giving a syntactic proof of Yoneda's lemma.
Prophet inequalities compare online stopping strategies against an omniscient "prophet" using distributional knowledge. In this work, we augment this model with a conservative prediction of the maximum realized value. We quantify the quality of this prediction using a parameter $α\in [0,1]$, ranging from inaccurate to perfect. Our goal is to improve performance when predictions are accurate (consistency) while maintaining theoretical guarantees when they are not (robustness). We propose a threshold-based strategy oblivious to $α$ (i.e., with $α$ unknown to the algorithm) that matches the classic competitive ratio of $1/2$ at $α=0$ and improves smoothly to $3/4$ at $α=1$. We further prove that simultaneously achieving better than $3/4$ at $α=1$ while maintaining $1/2$ at $α=0$ is impossible. Finally, when $α$ is known in advance, we present a strategy achieving a tight competitive ratio of $\frac{1}{2-α}$.
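The classic $1/2$ guarantee at $α=0$ is achieved by a single-threshold stopping rule that accepts the first value exceeding a threshold the prophet's maximum clears with probability $1/2$. As a concrete illustration of that baseline only (not the paper's $α$-aware strategy), here is a minimal Monte Carlo sketch; the function name `simulate`, the i.i.d. Uniform(0,1) values, and all parameters are illustrative assumptions:

```python
import random

def simulate(n_items=5, rounds=20000, seed=0):
    """Single-threshold prophet-inequality baseline on i.i.d. Uniform(0,1)
    values; returns the empirical ratio E[algorithm] / E[prophet]."""
    rng = random.Random(seed)
    # For n i.i.d. uniforms, P(max <= m) = m^n, so the median of the
    # maximum is 0.5**(1/n): the prophet clears tau with probability 1/2.
    tau = 0.5 ** (1.0 / n_items)
    alg_total, prophet_total = 0.0, 0.0
    for _ in range(rounds):
        vals = [rng.random() for _ in range(n_items)]
        pick = next((v for v in vals if v >= tau), 0.0)  # stop at first value >= tau
        alg_total += pick
        prophet_total += max(vals)
    return alg_total / prophet_total
```

On this particular distribution the empirical ratio comfortably exceeds the worst-case guarantee of $1/2$; the $1/2$ bound is tight only for adversarially chosen distributions.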
2602.17309 A prefix code L satisfies the condition that no word of L is a proper prefix of another word of L. Recently, Ko, Han and Salomaa relaxed this condition by allowing a word of L to be a proper prefix of at most k words of L, for some `margin' k, thus introducing the class of k-prefix-free languages, as well as the similar classes of k-suffix-free and k-infix-free languages. Here we unify the definitions of these three classes of languages into one uniform definition in two ways: via the method of partial orders and via the method of transducers. Thus, for any known class of code-related languages definable via the transducer method, one gets a marginal version of that class. Building on the techniques of Ko, Han and Salomaa, we discuss the \emph{uniform} satisfaction and maximality problems for marginal classes of languages.
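The marginal condition is easy to state operationally for a finite language. The following sketch (a hypothetical helper, not from the paper) checks k-prefix-freeness; setting $k=0$ recovers ordinary prefix codes:

```python
def is_k_prefix_free(L, k):
    """Check that every word of L is a proper prefix of at most k words of L.
    k = 0 is the classic prefix-code condition."""
    words = set(L)
    for w in words:
        # count proper extensions of w inside L
        extensions = sum(1 for u in words if u != w and u.startswith(w))
        if extensions > k:
            return False
    return True
```

For example, {0, 01, 011} is 2-prefix-free but not 1-prefix-free, since "0" is a proper prefix of two other words.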
2602.17242 We introduce \emph{TAPO-Structured Description Logic} (TAPO--DL), a formal extension of classical description logic designed to model \emph{information behavior} as a structured, dynamic process. TAPO--DL extends the standard T--Box/A--Box architecture with two additional layers: a \emph{Procedural Box} (P--Box), which supports concept-driven, imperative-style programs such as conditional and iterative actions, and an \emph{Oracle Box} (O--Box), which formalizes controlled interaction with external information sources. While the terminological and assertional components capture static conceptual and factual knowledge, the procedural and oracle-based components enable the explicit representation of information-generating actions and external validation. We provide a unified semantic framework for TAPO--DL based on a co-generative, sheaf-theoretic interpretation, in which local informational states are modeled as sections and informational stability corresponds to the existence of coherent global structures. Within this setting, informational truth is characterized as stability under repeated agentive interaction rather than correspondence to a fixed global state. By integrating description logic with procedural dynamics, oracle-based reasoning, and sheaf-theoretic semantics, TAPO--DL offers a principled formal framework for analyzing information behavior in contexts involving interaction, uncertainty, and contextuality.
Compressed suffix arrays (CSAs) index large repetitive collections and are key in many text applications. The r-index and its derivatives combine the run-length Burrows-Wheeler Transform (BWT) with suffix array sampling to achieve space proportional to the number of equal-symbol runs in the BWT. While effective for near-identical strings, their size grows quickly as variation increases, since the number of BWT runs is sensitive to edits. Existing approaches typically trade space for query speed, or vice versa, limiting their practicality at large scale. We introduce variable-length blocking (VLB), an encoding technique for BWT-based CSAs that adapts the amount of indexing information to local compressibility. The BWT is recursively divided into blocks of at most w runs (a parameter) and organized into a tree. Compressible regions appear near the root and store little auxiliary data, while incompressible regions lie deeper and retain additional information to speed up access. Queries traverse a short root-to-leaf path followed by a small run scan. This strategy balances space and query speed by transferring bits saved in compressible areas to accelerate access in incompressible ones. Backward search relies on rank and successor queries over the BWT. We introduce a sampling technique that guarantees correctness only along valid backward-search states, reducing space without affecting query performance. We extend VLB to encode the subsampled r-index (sr-index). Experiments show that VLB-based techniques outperform the r-index and sr-index in query time, while retaining space close to that of the sr-index. Compared to the move data structure, VLB offers a more favorable space-time tradeoff.
2602.17142 Abstract interpretation has been shown to be a promising technique for the thread-modular verification of concurrent programs. Central to this is the generation of interferences, in the form of rely-guarantee conditions, conforming to a user-chosen structure. In this work, we introduce one such structure called the conditional-writes domain, designed for programs where it suffices to establish only the conditions under which particular variables are written to by each thread. We formalise our analysis within a novel abstract interpretation framework that is highly modular and can be easily extended to capture other structures for rely-guarantee conditions. We formalise two versions of our approach and evaluate their implementations on a simple programming language.
We study preference learning through recommendations in multi-agent game settings, where a moderator repeatedly interacts with agents whose utility functions are unknown. In each round, the moderator issues action recommendations and observes whether agents follow or deviate from them. We consider two canonical behavioral feedback models, best response and quantal response, and study how the information revealed by each model affects the learnability of agents' utilities. We show that under quantal-response feedback the game is learnable, up to a positive affine equivalence class, with logarithmic sample complexity in the desired precision, whereas best-response feedback can only identify a larger set of agents' utilities. We give a complete geometric characterization of this set. Moreover, we introduce a regret notion based on agents' incentives to deviate from recommendations and design an online algorithm with low regret under both feedback models, with bounds scaling linearly in the game dimension and logarithmically in time. Our results lay a theoretical foundation for AI recommendation systems in strategic multi-agent environments, where recommendation compliance is shaped by strategic interaction.
Much of the advancement of Multi-Agent Reinforcement Learning (MARL) in imperfect-information games has historically depended on manual iterative refinement of baselines. While foundational families like Counterfactual Regret Minimization (CFR) and Policy Space Response Oracles (PSRO) rest on solid theoretical ground, the design of their most effective variants often relies on human intuition to navigate a vast algorithmic design space. In this work, we propose the use of AlphaEvolve, an evolutionary coding agent powered by large language models, to automatically discover new multi-agent learning algorithms. We demonstrate the generality of this framework by evolving novel variants for two distinct paradigms of game-theoretic learning. First, in the domain of iterative regret minimization, we evolve the logic governing regret accumulation and policy derivation, discovering a new algorithm, Volatility-Adaptive Discounted (VAD-)CFR. VAD-CFR employs novel, non-intuitive mechanisms, including volatility-sensitive discounting, consistency-enforced optimism, and a hard warm-start policy accumulation schedule, to outperform state-of-the-art baselines like Discounted Predictive CFR+. Second, in the regime of population-based training algorithms, we evolve training-time and evaluation-time meta-strategy solvers for PSRO, discovering a new variant, Smoothed Hybrid Optimistic Regret (SHOR-)PSRO. SHOR-PSRO introduces a hybrid meta-solver that linearly blends Optimistic Regret Matching with a smoothed, temperature-controlled distribution over best pure strategies. By dynamically annealing this blending factor and diversity bonuses during training, the algorithm automates the transition from population diversity to rigorous equilibrium finding, yielding superior empirical convergence compared to standard static meta-solvers.
We study a setting in which a data buyer seeks to estimate an unknown parameter by purchasing samples from one of K data sellers. Each seller has privately known data quality (e.g., high vs. low variance) and a private per-sample cost. We consider a multi-stage game in which the first stage is a free-trial stage in which the sellers have the option of signaling data quality by offering a few samples of data for free. Buyers update their beliefs based on the sample variance of the free data and then run a procurement auction to buy data in a second stage. For the auction stage, we characterize an approximately optimal Bayesian incentive compatible mechanism: the buyer selects a single seller by minimizing a belief-adjusted virtual cost and chooses the purchased sample size as a function of posterior quality and virtual cost. For the free-trial stage, we characterize the equilibrium, taking the above mechanism as the continuation game. Free trials may fail to emerge: for some parameters, all sellers reveal zero samples. However, under sufficiently strong competition (large K), there is an equilibrium in which sellers reveal the maximum allowable number of samples; in fact, it is the unique equilibrium.
2602.16880 Uniform interpolation is a strong form of interpolation providing an interpretation of propositional quantifiers within a propositional logic. Pitts' seminal work establishes this property for intuitionistic propositional logic relying on a sequent calculus in which naïve backward proof-search terminates. This constructive approach has been adapted to a wide range of logics, including intuitionistic modal logics. Surprisingly, no intuitionistic modal logic with independent box and diamond has yet been shown to satisfy uniform interpolation. We fill in this gap by proving the uniform interpolation property for Constructive K (CK) and Wijesekera's K (WK). We build on Pitts' technique by exploiting existing terminating calculi for CK and WK, which we prove to eliminate cut, and formalise all our results in the proof assistant Rocq. Together, our results constitute the first positive uniform interpolation results for intuitionistic modal logics with diamond.
2602.16867 In this paper, we study the {\em green bin packing} (GBP) problem where $β\ge 0$ and $G \in [0, 1]$ are two given values as part of the input. The energy consumed by a bin is $\max \{0, β(x-G) \}$ where $x$ is the total size of the items packed into the bin. The GBP aims to pack all items into a set of unit-capacity bins so that the number of bins used plus the total energy consumption is minimized. When $β= 0$ or $G = 1$, GBP is reduced to the classic bin packing (BP) problem. In the {\em constrained green bin packing} (CGBP) problem, the objective is to minimize the number of bins used to pack all items while the total energy consumption does not exceed a given upper bound $U$. We present an APTAS and a $\frac 32$-approximation algorithm for both GBP and CGBP, where the ratio $\frac 32$ matches the lower bound of BP. Keywords: Green bin packing; constrained green bin packing; approximation scheme; offline algorithms
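The GBP objective is fully specified by the abstract: bin count plus the per-bin energy $\max\{0, β(x-G)\}$. A minimal sketch of this objective (the function name and the example packing are illustrative assumptions), which also exhibits the reductions to classic BP at $β=0$ or $G=1$:

```python
def gbp_cost(bins, beta, G):
    """Green bin packing objective: number of bins used plus total energy,
    where a bin with total item size x consumes max(0, beta * (x - G))."""
    assert all(sum(b) <= 1.0 + 1e-9 for b in bins), "unit-capacity bins"
    energy = sum(max(0.0, beta * (sum(b) - G)) for b in bins)
    return len(bins) + energy
```

With $β=0$ the energy vanishes, and with $G=1$ every bin has $x \le 1 = G$, so in both cases the cost is just the bin count, recovering classic bin packing.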
Unlike in TFNP, for which there is an abundance of problems capturing natural existence principles which are incomparable (in the black-box setting), Kleinberg et al. [KKMP21] observed that many of the natural problems considered so far in the second level of the total function polynomial hierarchy (TF$Σ_2$) reduce to the Strong Avoid problem. In this work, we prove that the Linear Ordering Principle does not reduce to Strong Avoid in the black-box setting, exhibiting the first TF$Σ_2$ problem that lies outside of the class of problems reducible to Strong Avoid. The proof of our separation exploits a connection between total search problems in the polynomial hierarchy and proof complexity, recently developed by Fleming, Imrek, and Marciot [FIM25]. In particular, this implies that to show our separation, it suffices to show that there is no small proof of the Linear Ordering Principle in a $Σ_2$-variant of the Sherali-Adams proof system. To do so, we extend the classical pseudo-expectation method to the $Σ_2$ setting, showing that the existence of a $Σ_2$ pseudo-expectation precludes a $Σ_2$ Sherali-Adams proof. The main technical challenge is in proving the existence of such a pseudo-expectation; we manage to do so by solving a combinatorial covering problem about permutations. We also show that the extended pseudo-expectation bound implies that the Linear Ordering Principle cannot be reduced to any problem admitting a low-degree Sherali-Adams refutation.
2602.16688 In this work, we study the hardness of approximation of the fair $k$-center problem. Here the data points are partitioned into groups and the task is to choose a prescribed number of data points from each group, called centers, while minimizing the maximum distance from any point to its closest center. Although a polynomial-time $3$-approximation is known for this problem in general metrics, it has remained open whether this approximation guarantee is tight or could be further improved, especially since the unconstrained $k$-center problem admits a polynomial-time factor-$2$ approximation. We resolve this open question by proving that, for every $ε>0$, it is NP-hard to achieve a $(3-ε)$-approximation. Our inapproximability results hold even when only two disjoint groups are present and at least one center must be chosen from each group. Further, the hardness extends to the canonical one-per-group setting with $k$ groups (for arbitrary $k$), where exactly one center must be selected from each group. Consequently, the factor-$3$ barrier for fair $k$-center in general metric spaces is inherent, and existing $3$-approximation algorithms are optimal up to lower-order terms even in these restricted regimes. This result stands in sharp contrast to the $k$-supplier formulation, where both the unconstrained and fair variants admit factor-$3$ approximation in polynomial time.
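On tiny instances the fair $k$-center objective can be evaluated exactly by brute force. The sketch below (a hypothetical helper with exponential running time, purely to make the objective concrete, not the polynomial-time 3-approximation) picks the prescribed number of centers from each group and minimizes the covering radius:

```python
from itertools import combinations, product

def fair_k_center(dist, groups, quota):
    """Exhaustive fair k-center on a tiny instance: choose quota[g] centers
    from each group g, minimizing the maximum distance of any point to its
    nearest chosen center. dist is a full distance matrix."""
    n = len(dist)
    per_group = [combinations([i for i in range(n) if groups[i] == g], q)
                 for g, q in enumerate(quota)]
    best = float("inf")
    for choice in product(*per_group):
        centers = [c for part in choice for c in part]
        radius = max(min(dist[p][c] for c in centers) for p in range(n))
        best = min(best, radius)
    return best
```

For four points on a line at positions 0, 1, 2, 3 with groups alternating and one center per group, the fair constraint still admits radius 1, but some fair choices (e.g. the two endpoints of one parity) are strictly worse.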
We present a randomized algorithm for the single-source shortest paths (SSSP) problem on directed graphs with arbitrary real-valued edge weights that runs in $n^{2+o(1)}$ time with high probability. This result yields the first almost linear-time algorithm for the problem on dense graphs ($m = Θ(n^2)$) and improves upon the best previously known bounds for moderately dense graphs ($m = ω(n^{1.306})$). Our approach builds on the hop-reduction via shortcutting framework introduced by Li, Li, Rao, and Zhang (2025), which iteratively augments the graph with shortcut edges to reduce the negative hop count of shortest paths. The central computational bottleneck in prior work is the cost of explicitly constructing these shortcuts in dense regions. We overcome this by introducing a new compression technique using auxiliary Steiner vertices. Specifically, we construct these vertices to represent large neighborhoods compactly in a structured manner, allowing us to efficiently generate and propagate shortcuts while strictly controlling the growth of vertex degrees and graph size.
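For context, the classical baseline for SSSP with arbitrary real edge weights is Bellman-Ford, running in $O(nm)$ time. A minimal sketch of that textbook algorithm (not the $n^{2+o(1)}$ shortcutting method described above):

```python
def bellman_ford(n, edges, src):
    """Textbook O(nm) SSSP with real (possibly negative) edge weights.
    edges is a list of (u, v, w) triples. Returns the distance array,
    or None if a negative cycle is reachable from src."""
    INF = float("inf")
    dist = [INF] * n
    dist[src] = 0.0
    for _ in range(n - 1):          # n-1 rounds of edge relaxation
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    for u, v, w in edges:           # one extra pass detects negative cycles
        if dist[u] + w < dist[v]:
            return None
    return dist
```

The quadratic-per-source cost of such relaxation-based methods on dense graphs is exactly the regime where the hop-reduction approach gives its improvement.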
2602.16612 Abstracting from a low level to a more explanatory high level of description, and ideally while preserving causal structure, is fundamental to scientific practice, to causal inference problems, and to robust, efficient and interpretable AI. We present a general account of abstractions between low and high level models as natural transformations, focusing on the case of causal models. This provides a new formalisation of causal abstraction, unifying several notions in the literature, including constructive causal abstraction, Q-$τ$ consistency, abstractions based on interchange interventions, and `distributed' causal abstractions. Our approach is formalised in terms of category theory, and uses the general notion of a compositional model with a given set of queries and semantics in a monoidal, cd- or Markov category; causal models and their queries such as interventions being special cases. We identify two basic notions of abstraction: downward abstractions mapping queries from high to low level; and upward abstractions, mapping concrete queries such as Do-interventions from low to high. Although usually presented as the latter, we show how common causal abstractions may, more fundamentally, be understood in terms of the former. Our approach also leads us to consider a new stronger notion of `component-level' abstraction, applying to the individual components of a model. In particular, this yields a novel, strengthened form of constructive causal abstraction at the mechanism-level, for which we prove characterisation results. Finally, we show that abstraction can be generalised to further compositional models, including those with a quantum semantics implemented by quantum circuits, and we take first steps in exploring abstractions between quantum compositional circuit models and high-level classical causal models as a means to explainable quantum AI.
A signed tree model of a graph $G$ is a compact binary structure consisting of a rooted binary tree whose leaves are bijectively mapped to the vertices of $G$, together with 2-colored edges $xy$, called transversal pairs, interpreted as bicliques or anti-bicliques whose sides are the leaves of the subtrees rooted at $x$ and at $y$. We design an algorithm that, given such a representation of an $n$-vertex graph $G$ with $p$ transversal pairs and a source $v \in V(G)$, computes a shortest-path tree rooted at $v$ in $G$ in time $O(p \log n)$. A wide variety of graph classes are such that for all $n$, their $n$-vertex graphs admit signed tree models with $O(n)$ transversal pairs: for instance, those of bounded symmetric difference, more generally of bounded sd-degeneracy, as well as interval graphs. As applications of our Single-Source Shortest Path algorithm and new techniques, we:
- improve the runtime of the fixed-parameter algorithm for first-order model checking on graphs given with a witness of low merge-width from cubic [Dreier and Toruńczyk, STOC '25] to quadratic;
- give an $O(n^2 \log n)$-time algorithm for All-Pairs Shortest Path (APSP) on graphs given with a witness of low merge-width, generalizing a result known on twin-width [Twin-Width III, SICOMP '24];
- extend and simplify an $O(n^2 \log n)$-time algorithm for multiplying two $n \times n$ matrices $A, B$ of bounded twin-width in [Twin-Width V, STACS '23]: now $A$ solely has to be an adjacency matrix of a graph of bounded twin-width and $B$ can be arbitrary;
- give an $O(n^2 \log^2 n)$-time algorithm for APSP on graphs of bounded twin-width, bypassing the need for contraction sequences in [Twin-Width III, SICOMP '24; Bannach et al. STACS '24];
- give an $O(n^{7/3} \log^2 n)$-time algorithm for APSP on graphs of symmetric difference $O(n^{1/3})$.
We introduce MerLean, a fully automated agentic framework for autoformalization in quantum computation. MerLean extracts mathematical statements from \LaTeX{} source files, formalizes them into verified Lean~4 code built on Mathlib, and translates the result back into human-readable \LaTeX{} for semantic review. We evaluate MerLean on three theoretical quantum computing papers, producing 2,050 Lean declarations from 114 statements in total. MerLean achieves end-to-end formalization on all three papers, reducing the verification burden to only the newly introduced definitions and axioms. Our results demonstrate that agentic autoformalization can scale to frontier research, offering both a practical tool for machine-verified peer review and a scalable engine for mining high-quality synthetic data to train future reasoning models. Our approach can also be generalized to any other rigorous research in mathematics and theoretical physics.
2602.16532 Determining if an input undirected graph is Hamiltonian, i.e., if it has a cycle that visits every vertex exactly once, is one of the most famous NP-complete problems. We consider the following generalization of Hamiltonian cycles: for a fixed set $S$ of natural numbers, we want to visit each vertex of a graph $G$ exactly once and ensure that any two consecutive vertices can be joined in $k$ hops for some choice of $k \in S$. Formally, an $S$-Hamiltonian cycle is a permutation $(v_0,\ldots,v_{n-1})$ of the vertices of $G$ such that, for $0 \leq i \leq n-1$, there exists a walk between $v_i$ and $v_{i+1 \bmod n}$ whose length is in $S$. (We do not impose any constraints on how many times vertices can be visited as intermediate vertices of walks.) Of course, Hamiltonian cycles in the standard sense correspond to $S=\{1\}$. We study the $S$-Hamiltonian cycle problem of deciding whether an input graph $G$ has an $S$-Hamiltonian cycle. Our goal is to determine the complexity of this problem depending on the fixed set $S$. It is already known that the problem remains NP-complete for $S=\{1,2\}$, whereas it is trivial for $S=\{1,2,3\}$ because any connected graph contains a $\{1,2,3\}$-Hamiltonian cycle. Our work classifies the complexity of this problem for most kinds of sets $S$, with the key new results being the following: we have NP-completeness for $S = \{2\}$ and for $S = \{2, 4\}$, but tractability for $S = \{1, 2, 4\}$, for $S = \{2, 4, 6\}$, for any superset of these two tractable cases, and for $S$ the infinite set of all odd integers. The remaining open cases are the non-singleton finite sets of odd integers, in particular $S = \{1, 3\}$. Beyond cycles, we also discuss the complexity of finding $S$-Hamiltonian paths, and show that our problems are all tractable on graphs of bounded cliquewidth.
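For a finite set $S$, the definition can be checked directly on small graphs: a walk of length $k$ from $u$ to $v$ exists iff the $k$-th power of the adjacency matrix is nonzero at $(u,v)$. A brute-force sketch (an illustrative helper, exponential in the number of vertices, for finite $S$ only):

```python
from itertools import permutations

def has_s_hamiltonian_cycle(adj, S):
    """Brute-force test for an S-Hamiltonian cycle on a tiny graph: some
    cyclic ordering of the vertices in which every consecutive pair is
    joined by a walk whose length lies in the finite set S."""
    n = len(adj)
    # walk[k][u][v] is truthy iff there is a walk of length k from u to v
    walk = {1: [row[:] for row in adj]}
    for k in range(2, max(S) + 1):
        walk[k] = [[any(walk[k - 1][u][x] and adj[x][v] for x in range(n))
                    for v in range(n)] for u in range(n)]
    ok = lambda u, v: any(k in S and walk[k][u][v] for k in walk)
    for perm in permutations(range(n)):
        if all(ok(perm[i], perm[(i + 1) % n]) for i in range(n)):
            return True
    return False
```

On the path 0-1-2, for instance, there is no ordinary Hamiltonian cycle ($S=\{1\}$) and none for $S=\{2\}$, but $S=\{1,2\}$ admits one, since the endpoints are joined by the length-2 walk through the middle vertex.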