Rendering, simulation, computational geometry, and visual effects
In the Rectangle Stabbing problem, the input is a set ${\cal R}$ of axis-parallel rectangles and a set ${\cal L}$ of axis-parallel lines in the plane. The task is to find a minimum-size set ${\cal L}^* \subseteq {\cal L}$ such that for every rectangle $R \in {\cal R}$ there is a line $\ell \in {\cal L}^*$ that intersects $R$. Gaur et al. [Journal of Algorithms, 2002] gave a polynomial-time $2$-approximation algorithm, while Dom et al. [WALCOM 2009] and Giannopoulos et al. [EuroCG 2009] independently showed that, assuming FPT $\neq$ W[1], there is no algorithm with running time $f(k)(|{\cal L}||{\cal R}|)^{O(1)}$ that determines whether there exists an optimal solution with at most $k$ lines. We give the first parameterized approximation algorithm for the problem with a ratio better than $2$. In particular, we give an algorithm that, given ${\cal R}$, ${\cal L}$, and an integer $k$, runs in time $k^{O(k)}(|{\cal L}||{\cal R}|)^{O(1)}$ and either correctly concludes that there is no solution with at most $k$ lines, or produces a solution with at most $\frac{7k}{4}$ lines. We complement our algorithm by showing that unless FPT $=$ W[1], the Rectangle Stabbing problem does not admit a $(\frac{5}{4}-ε)$-approximation algorithm running in $f(k)(|{\cal L}||{\cal R}|)^{O(1)}$ time for any function $f$ and $ε > 0$.
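To make the problem statement concrete, here is a minimal brute-force sketch in Python (a toy reference only, not the parameterized approximation algorithm from the abstract): it tries all subsets of at most $k$ lines and checks whether every rectangle is stabbed. Line and rectangle encodings are illustrative assumptions.

```python
from itertools import combinations

def stabs(line, rect):
    """Check whether an axis-parallel line intersects a rectangle.
    line: ('v', x) for a vertical line x = const, or ('h', y) for horizontal.
    rect: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2."""
    kind, c = line
    x1, y1, x2, y2 = rect
    return (x1 <= c <= x2) if kind == 'v' else (y1 <= c <= y2)

def min_stabbing_set(lines, rects, k):
    """Exhaustive search for a stabbing set of at most k lines.
    Runs in O(|L|^k * |R|) time, so usable only for tiny instances."""
    for size in range(k + 1):
        for subset in combinations(lines, size):
            if all(any(stabs(l, r) for l in subset) for r in rects):
                return list(subset)
    return None  # no solution with at most k lines exists
```

For example, with rectangles $[0,1]^2$ and $[2,3]^2$, the two vertical lines $x=0.5$ and $x=2.5$ form a stabbing set of size $2$, while no single line suffices.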
Physics-based simulation involves trade-offs between performance and accuracy. In collision detection, one trade-off is the granularity of collider geometry. Primitive-based colliders such as bounding boxes are efficient, while using the original mesh is more accurate but often computationally expensive. Approximate Convex Decomposition (ACD) methods strive for a balance of efficiency and accuracy. Prior works can produce high-quality decompositions but require large numbers of convex parts and are sensitive to the orientation of the input mesh. We address these weaknesses with VisACD, a visibility-based, rotation-equivariant, and intersection-free ACD algorithm with GPU acceleration. Our approach produces high-quality decompositions with fewer convex parts, is not sensitive to shape orientation, and is more efficient than prior work.
A unique sink orientation (USO) is an orientation of the edges of a polytope in which every face contains a unique sink. For a product of simplices $Δ_{m-1} \times Δ_{n-1}$, Felsner, Gärtner and Tschirschnitz (2005) characterize the USOs induced by linear functions as the USOs on an $(m \times n)$-grid that correspond to a two-colored arrangement of lines. We generalize some of their results to products $Δ^1 \times\cdots\times Δ^r$ of $r$ simplices, USOs on $r$-dimensional grids and $(r+1)$-signotopes.
We provide a simple algorithm for computing a balanced separator for a $c$-packed set of segments, showing that the separator cuts only $O(c)$ segments. While the result was known before, our proof is arguably simpler.
We present a method for generating orthogonal quadrilateral meshes subject to user-defined feature alignment and sizing constraints. The approach relies on computing integrable orthogonal frame fields, whose symmetries are implicitly represented using orthogonally decomposable (odeco) tensors. We extend the existing 2D odeco integrability formulation to the 3D setting, and define the useful energies in a finite element approach. Our frame fields are shear-free (orthogonal) by construction, and we provide terms to minimize area and/or stretch distortion. The optimization naturally creates and places singularities to achieve integrability, obviating the need for user placement or greedy iterative methods. We validate the method on both smooth surfaces and feature-rich CAD models. Compared to previous works on integrable frame fields, we offer better performance in the presence of mesh sizing constraints and achieve lower distortion metrics.
Participating media are a pervasive and intriguing visual effect in virtual environments. Unfortunately, rendering such phenomena in real-time is notoriously difficult due to the computational expense of estimating the volume rendering equation. While the six-way lightmaps technique has been widely used in video games to render smoke with a camera-oriented billboard and approximate lighting effects using six precomputed lightmaps, achieving a balance between realism and efficiency, it is limited to pre-simulated animation sequences and does not account for camera movement. In this work, we propose a neural six-way lightmaps method to strike a long-sought balance between dynamics and visual realism. Our approach first generates a guiding map from the camera view using ray marching with a large sampling distance to approximate smoke scattering and silhouette. Then, given a guiding map, we train a neural network to predict the corresponding six-way lightmaps. The resulting lightmaps can be seamlessly used in existing game engine pipelines. This approach supports visually appealing rendering effects while enabling real-time user interactivity, including smoke-obstacle interaction, camera movement, and light change. By conducting a series of comprehensive benchmarks, we demonstrate that our method is well-suited for real-time applications, such as games and VR/AR.
With recent advances in frontier multimodal large language models (MLLMs) for data understanding and visual reasoning, the role of LLMs has evolved from passive LLM-as-an-interface to proactive LLM-as-a-judge, enabling deeper integration into scientific data analysis and visualization pipelines. However, existing scientific visualization agents still rely on domain experts to provide prior knowledge for specific datasets or visualization-oriented objective functions to guide the workflow through iterative feedback. This reactive, data-dependent, human-in-the-loop (HITL) paradigm is time-consuming and does not scale effectively to large-scale scientific data. In this work, we propose a Self-Directed Agent for Scientific Analysis and Visualization (SASAV), the first fully autonomous AI agent to perform scientific data analysis and generate insightful visualizations without any external prompting or HITL feedback. SASAV is a multi-agent system that automatically orchestrates data exploration workflows through our proposed components, including automated data profiling, context-aware knowledge retrieval, and reasoning-driven visualization parameter exploration, while supporting downstream interactive visualization tasks. This work establishes a foundational building block for future AI for Science, accelerating scientific discovery and innovation at scale.
Parametric boundary representation models (B-Reps) are the de facto standard in CAD, graphics, and robotics, yet converting them into valid meshes remains fragile. The difficulty originates from the unavoidable approximation of high-order surface and curve intersections to low-order primitives: the resulting geometric realization often fails to respect the exact topology encoded in the B-Rep, producing meshes with incorrect or missing adjacencies. Existing meshing pipelines address these inconsistencies through heuristic feature-merging and repair strategies that offer no topological guarantees and frequently fail on complex models. We propose a fundamentally different approach: the B-Rep topology is treated as an invariant of the meshing process. Our algorithm enforces the exact B-Rep topology while allowing a single user-defined tolerance to control the deviation of the mesh from the underlying parametric surfaces. Consequently, for any admissible tolerance, the output mesh is topologically correct; only its geometric fidelity degrades as the tolerance increases. This decoupling eliminates the need for post-hoc repairs and yields robust meshes even when the underlying geometry is inconsistent or highly approximated. We evaluate our method on thousands of real-world CAD models from the ABC and Fusion 360 repositories, including instances that fail with standard meshing tools. The results demonstrate that topological guarantees at the algorithmic level enable reliable mesh generation suitable for downstream applications.
Professional color editing requires precise control over both color (hue and saturation) and lightness, ideally through separate, independent controls. We present a real-time interactive color editing framework for 3D Gaussian Splatting (3DGS) that enables palette-based recoloring, per-palette tone curves for color-aware lightness adjustment, and accurate pixel-level constraints -- capabilities unavailable in prior palette-based 3DGS methods. Existing approaches decompose colors at the primitive level, optimizing per-Gaussian palette weights before splatting. However, sparse primitive-level weights do not guarantee sparse pixel-level decompositions after alpha-blending, causing palette edits to affect unintended regions and degrading editing quality. We address this through view-space palette decomposition, splatting weights instead of colors to optimize the observable appearance of the scene. We introduce a geometric loss using inverse barycentric coordinates to enforce consistent sparsity patterns, ensuring similar colors share similar decompositions. Our approach achieves superior editing quality compared to primitive-space methods, enabling professional color grading workflows for 3DGS scenes with real-time interaction.
Estimating correspondences between deformed shape instances is a long-standing problem in computer graphics; numerous applications, from texture transfer to statistical modelling, rely on recovering an accurate correspondence map. Many methods have thus been proposed to tackle this challenging problem from varying perspectives, depending on the downstream application. This state-of-the-art report is geared towards researchers, practitioners, and students seeking to understand recent trends and advances in the field. We categorise developments into three paradigms: spectral methods based on functional maps, combinatorial formulations that impose discrete constraints, and deformation-based methods that directly recover a global alignment. Each school of thought offers different advantages and disadvantages, which we discuss throughout the report. Meanwhile, we highlight the latest developments in each area and suggest new potential research directions. Finally, we provide an overview of emerging challenges and opportunities in this growing field, including the recent use of vision foundation models for zero-shot correspondence and the particularly challenging task of matching partial shapes.
Gaussian Splatting is a powerful tool for reconstructing diffuse scenes, but it struggles to simultaneously model specular reflections and the appearance of objects behind semi-transparent surfaces. These specular reflections and transmittance are essential for realistic novel view synthesis, and existing methods do not properly incorporate the underlying physical processes to simulate them. To address this issue, we propose RT-GS, a unified framework that integrates a microfacet material model and ray tracing to jointly model specular reflection and transmittance in Gaussian Splatting. We accomplish this by using separate Gaussian primitives for reflections and transmittance, which allow modeling distant reflections and reconstructing objects behind transparent surfaces concurrently. We utilize a differentiable ray tracing framework to obtain the specular reflection and transmittance appearance. Our experiments demonstrate that our method successfully produces reflections and recovers objects behind transparent surfaces in complex environments, achieving significant qualitative improvements over prior methods where these specular light interactions are prominent.
We present a new fully dynamic algorithm for maintaining convex hulls under insertions and deletions while supporting geometric queries. Our approach combines the logarithmic method with a deletion-only convex hull data structure, achieving amortised update times of $O(\log n \log \log n)$ and query times of $O(\log^2 n)$. We provide a robust and non-trivial implementation that supports point-location queries, a challenging and non-decomposable class of convex hull queries. We evaluate our implementation against the state of the art, including a new naive baseline that rebuilds the convex hull whenever an update affects it. On hulls that include polynomially many data points (e.g. $Θ(n^\varepsilon)$ for some $\varepsilon$), such as the ones that often occur in practice, our method outperforms all other techniques. Update-heavy workloads strongly favour our approach, which is in line with our theoretical guarantees. Yet, our method remains competitive all the way down to an update-to-query ratio of $1$ to $10$. Experiments on real-world data sets furthermore reveal that existing fully dynamic techniques suffer from significant robustness issues. In contrast, our implementation remains stable across all tested inputs.
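A simplified lazy-rebuild baseline of the kind the abstract compares against can be sketched in a few lines of Python (an illustrative sketch, not the paper's implementation, which rebuilds eagerly when an update affects the hull): the hull is recomputed from scratch with Andrew's monotone chain whenever a query follows an update.

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order.
    O(n log n) due to sorting; collinear boundary points are dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

class NaiveDynamicHull:
    """Rebuild-from-scratch baseline: updates are O(1), and the hull is
    recomputed lazily on the first query after any update."""
    def __init__(self):
        self.points, self.hull, self.dirty = set(), [], False
    def insert(self, p):
        self.points.add(p); self.dirty = True
    def delete(self, p):
        self.points.discard(p); self.dirty = True
    def get_hull(self):
        if self.dirty:
            self.hull = convex_hull(self.points)
            self.dirty = False
        return self.hull
```

This baseline is hard to beat when updates rarely touch the hull, which is why the paper's comparison against it on update-heavy workloads is the interesting regime.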
Orthogonal graph layout algorithms aim to produce clear, compact, and readable network diagrams by arranging nodes and edges along horizontal and vertical lines, while minimizing bends and crossings. Most existing orthogonal layout methods focus primarily on quality criteria such as area usage, total edge length, and bend minimization. Explicitly controlling the global aspect ratio (AR) of the resulting layout remains unexplored: existing orthogonal layout methods offer no control over the resulting AR, and their rigid geometric constraints make adaptation of finished layouts difficult. With the increasing variety of aspect ratios encountered in daily life, from wide monitors to tall mobile devices or fixed-size interface panels, there is a clear need for aspect ratio control in orthogonal layout methods. To tackle this issue, we introduce Aspect Ratio-Constrained Orthogonal Layout (ARCOL). Building upon the Human-like Orthogonal Layout Algorithm (HOLA)~\cite{Kieffer2016}, we integrate aspect ratio at two different stages: (1) into the stress minimization phase, as a soft constraint, allowing the layout algorithm to gently guide node positions toward a specified target AR, while preserving visual clarity and topological faithfulness; and (2) into the tree reattachment phase, where we modify the cost function to favor placements that improve the AR. We evaluate our approach through quantitative evaluation and a user study, as well as expert interviews. Our evaluations show that ARCOL produces balanced and space-efficient orthogonal layouts across diverse aspect ratios.
Computing the Voronoi diagram of mixed geometric objects in $R^3$ is challenging due to the high cost of exact geometric predicates via Cylindrical Algebraic Decomposition (CAD). We propose an efficient exact verification framework that characterizes the parameter space connectivity by computing certified topological transition sets. We analyze the fundamental non-quadric case: the trisector of two skew lines and one circle in $R^3$. Since the bisectors of circles and lines are not quadric surfaces, the pencil-of-quadrics analysis previously used for the trisectors of three lines is no longer applicable. Our pipeline uses exact symbolic evaluations to identify transition walls. Jacobian computations certify the absence of affine singularities, while projective closure shows singular behavior is isolated at a single point at infinity, $p_{\infty}$. Tangent-cone analysis at $p_{\infty}$ yields a discriminant $Δ_Q = 4ks^2(k-1)$, identifying $k=0,1$ as bifurcation values. Using directional blow-up coordinates, we rigorously verify that the trisector's real topology remains locally constant between these walls. Finally, we certify that $k=0,1$ are actual topological walls exhibiting reducible splitting. This work provides the exact predicates required for constructing mixed-object Voronoi diagrams beyond the quadric-only regime.
Persistent homology is a central tool in topological data analysis, but its application to large and noisy datasets is often limited by computational cost and the presence of spurious topological features. Noise not only increases data size but also obscures the underlying structure of the data. In this paper, we propose the Refined Characteristic Lattice Algorithm (RCLA), a grid-based method that integrates data reduction with threshold-based denoising in a single procedure. By incorporating a threshold parameter $k$, RCLA removes noise while preserving the essential structure of the data in a single pass. We further provide a theoretical guarantee by proving a stability theorem under a homogeneous Poisson noise model, which bounds the bottleneck distance between the persistence diagrams of the output and the underlying shape with high probability. In addition, we introduce an automatic parameter selection method based on nearest-neighbor statistics. Experimental results demonstrate that RCLA consistently outperforms existing methods, and its effectiveness is further validated on a 3D shape classification task.
The rise of 3D anime-style avatars in gaming, virtual reality, and other digital media has driven significant interest in automated generation methods capable of capturing their distinctive visual characteristics. These include stylized proportions, expressive features, and non-photorealistic rendering. This paper reviews the advancements and challenges in using deep learning in 3D anime-style avatar generation. We analyze the strengths and limitations of these methods in capturing the aesthetics of anime characters and supporting customization and animation. Additionally, we identify and discuss open problems in the field, such as difficulties in resolution and detail preservation, and constraints regarding the animation of hair and loose clothing. This article aims to provide a comprehensive overview of the current state-of-the-art and identify promising research directions for advancing 3D anime-style avatar generation.
The $k$-means problem is a classic objective for modeling clustering in a metric space. Given a set of points in a metric space, the goal is to find $k$ representative points so as to minimize the sum of the squared distances from each point to its closest representative. In this work, we study the approximability of $k$-means in Euclidean spaces parameterized by the number of clusters, $k$. In seminal works, de la Vega, Karpinski, Kenyon, and Rabani [STOC'03] and Kumar, Sabharwal, and Sen [JACM'10] showed how to obtain a $(1+\varepsilon)$-approximation for high-dimensional Euclidean $k$-means in time $2^{(k/\varepsilon)^{O(1)}} \cdot dn^{O(1)}$. In this work, we introduce a new fine-grained hypothesis called Exponential Time for Expanders Hypothesis (XXH) which roughly asserts that there are no non-trivial exponential time approximation algorithms for the vertex cover problem on near perfect vertex expanders. Assuming XXH, we close the above long line of work on approximating Euclidean $k$-means by showing that there is no $2^{(k/\varepsilon)^{1-o(1)}} \cdot n^{O(1)}$ time algorithm achieving a $(1+\varepsilon)$-approximation for $k$-means in Euclidean space. This lower bound is tight as it matches the algorithm given by Feldman, Monemizadeh, and Sohler [SoCG'07] whose runtime is $2^{\tilde{O}(k/\varepsilon)} + O(ndk)$. Furthermore, assuming XXH, we show that the seminal $O(n^{kd+1})$ runtime exact algorithm of Inaba, Katoh, and Imai [SoCG'94] for $k$-means is optimal for small values of $k$.
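For readers less familiar with the objective being bounded here, a minimal Python sketch of the $k$-means cost (just the objective from the first sentence of the abstract, not any of the algorithms discussed) is:

```python
def kmeans_cost(points, centers):
    """Sum of squared Euclidean distances from each point to its
    nearest center -- the k-means clustering objective."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, ctr)) for ctr in centers)
        for pt in points
    )
```

For example, with points $(0,0)$ and $(2,0)$, a single center at $(1,0)$ gives cost $1 + 1 = 2$, while placing a center on each point gives cost $0$.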
In the (continuous) Euclidean $k$-center problem, given $n$ points in $\mathbb{R}^d$ and an integer $k$, the goal is to find $k$ center points in $\mathbb{R}^d$ that minimize the maximum Euclidean distance from any input point to its closest center. In this paper, we establish conditional lower bounds for this problem in constant dimensions in two settings. $\bullet$ Parameterized by $k$: Assuming the Exponential Time Hypothesis (ETH), we show that there is no $f(k)n^{o(k^{1-1/d})}$-time algorithm for the Euclidean $k$-center problem. This result shows that the algorithm of Agarwal and Procopiuc [SODA 1998; Algorithmica 2002] is essentially optimal. Furthermore, our lower bound rules out any $(1+\varepsilon)$-approximation algorithm running in time $(k/\varepsilon)^{o(k^{1-1/d})}n^{O(1)}$, thereby establishing near-optimality of the corresponding approximation scheme by the same authors. $\bullet$ Small $k$: Assuming the 3-SUM hypothesis, we prove that for any $\varepsilon>0$ there is no $O(n^{2-\varepsilon})$-time algorithm for the Euclidean $2$-center problem in $\mathbb{R}^3$. This settles an open question posed by Agarwal, Ben Avraham, and Sharir [SoCG 2010; Computational Geometry 2013]. In addition, under the same hypothesis, we prove that for any $\varepsilon > 0$, the Euclidean $6$-center problem in $\mathbb{R}^2$ also admits no $O(n^{2-\varepsilon})$-time algorithm. The technical core of all our proofs is a novel geometric embedding of a system of linear equations. We construct a point set where each variable corresponds to a specific collection of points, and the geometric structure ensures that a small-radius clustering is possible if and only if the system has a valid solution.
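To ground the objective, here is a short Python sketch of the $k$-center cost together with the classic farthest-first traversal of Gonzalez, a well-known $2$-approximation that restricts centers to input points; this is background context, not one of the algorithms analyzed in the abstract.

```python
import math

def k_center_cost(points, centers):
    """Maximum distance from any input point to its closest center --
    the Euclidean k-center objective."""
    return max(min(math.dist(p, c) for c in centers) for p in points)

def gonzalez(points, k):
    """Farthest-first traversal (Gonzalez 1985): start from an arbitrary
    point, then repeatedly add the point farthest from the chosen
    centers. Yields a 2-approximation in O(nk) distance evaluations."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers
```

On the 1D instance $\{0, 1, 10, 11\}$ with $k=2$, the traversal picks the two far-apart points $0$ and $11$, achieving cost $1$.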
A furthest neighbor data structure on a metric space $(V,\mathrm{dist})$ and a set $P \subseteq V$ answers the following query: given $v \in V$, output $p \in P$ maximizing $\mathrm{dist}(v,p)$; in the approximate version, it is allowed to report any $p \in P$ with $\mathrm{dist}(v,p) \geq (1-\varepsilon)\max_{p' \in P} \mathrm{dist}(v,p')$ for an accuracy parameter $\varepsilon \in (0,1)$. A particular type of approximate furthest neighbor data structure is an $\varepsilon$-coreset: a small subset $Q \subseteq P$ such that for every query $v \in V$ there is a feasible answer $p \in Q$. Our main result is that in planar metrics there always exists an $\varepsilon$-coreset for furthest neighbors of size bounded polynomially in $1/\varepsilon$. This improves upon an exponential bound of Bourneuf and Pilipczuk [SODA'25] and resolves an open problem of de Berg and Theocharous [SoCG'24] for the case of polygons with holes. On the technical side, we develop a connection between $\varepsilon$-coresets for furthest neighbors and an invariant of a metric space that we call the $\varepsilon$-comatching index -- a sibling of the $\varepsilon$-(semi-)ladder index, a.k.a. $\varepsilon$-scatter dimension, as defined by Abbasi et al. [FOCS'23]. While the $\varepsilon$-(semi-)ladder index of planar metrics admits an exponential lower bound, we show that the $\varepsilon$-comatching index of planar metrics is polynomial in $1/\varepsilon$. The exponential separation between $\varepsilon$-(semi-)ladder and $\varepsilon$-comatching is rather surprising, and the proof is the main technical contribution of our work.
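The coreset property in the first two sentences can be checked mechanically on small examples. The Python sketch below (an illustrative finite-query sanity check under Euclidean distance; the abstract concerns all queries in a planar metric, and constructing the coreset is the hard part) verifies that a candidate $Q$ answers every listed query within factor $1-\varepsilon$:

```python
import math

def furthest(p, points):
    """Exact furthest neighbor of query p among `points`."""
    return max(points, key=lambda q: math.dist(p, q))

def is_coreset(Q, P, queries, eps):
    """Check the epsilon-coreset property on a finite set of queries:
    for each query v, some q in Q must satisfy
    dist(v, q) >= (1 - eps) * max_{p in P} dist(v, p)."""
    for v in queries:
        best = math.dist(v, furthest(v, P))
        if not any(math.dist(v, q) >= (1 - eps) * best for q in Q):
            return False
    return True
```

For instance, for the four corners of the unit square, $Q = P$ trivially passes, while $Q = \{(0,0)\}$ fails for the query $(0,0)$, whose true furthest neighbor is the opposite corner.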
Advances in diffusion, autoregressive, and hybrid models have enabled high-quality image synthesis for tasks such as text-to-image, editing, and reference-guided composition. Yet, existing benchmarks remain limited, either focus on isolated tasks, cover only narrow domains, or provide opaque scores without explaining failure modes. We introduce \textbf{ImagenWorld}, a benchmark of 3.6K condition sets spanning six core tasks (generation and editing, with single or multiple references) and six topical domains (artworks, photorealistic images, information graphics, textual graphics, computer graphics, and screenshots). The benchmark is supported by 20K fine-grained human annotations and an explainable evaluation schema that tags localized object-level and segment-level errors, complementing automated VLM-based metrics. Our large-scale evaluation of 14 models yields several insights: (1) models typically struggle more in editing tasks than in generation tasks, especially in local edits. (2) models excel in artistic and photorealistic settings but struggle with symbolic and text-heavy domains such as screenshots and information graphics. (3) closed-source systems lead overall, while targeted data curation (e.g., Qwen-Image) narrows the gap in text-heavy cases. (4) modern VLM-based metrics achieve Kendall accuracies up to 0.79, approximating human ranking, but fall short of fine-grained, explainable error attribution. ImagenWorld provides both a rigorous benchmark and a diagnostic tool to advance robust image generation.