Algorithmic trading, market structure, and statistical arbitrage
In this work we evaluate the performance of three classes of methods for detecting financial anomalies: topological data analysis (TDA), principal component analysis (PCA), and neural network-based approaches. We apply these methods to TSX-60 data to identify major financial stress events in the Canadian stock market. We show that neural network-based methods (such as GlocalKD and One-Shot GIN(E)) and TDA methods achieve the strongest performance. The effectiveness of TDA in detecting financial anomalies suggests that global topological properties are meaningful in distinguishing financial stress events.
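As a minimal illustration of the PCA baseline, the sketch below scores each trading day by its reconstruction error under a low-rank factor model; names and thresholds are illustrative, and the paper's TDA and neural detectors are not reproduced here.

```python
import numpy as np

def pca_anomaly_scores(returns: np.ndarray, n_factors: int = 3) -> np.ndarray:
    """Reconstruction-error anomaly score: project each day's cross-section
    of returns onto the top principal components and measure what is lost.
    returns: (T, N) matrix of daily returns for N stocks."""
    X = returns - returns.mean(axis=0)
    cov = np.cov(X, rowvar=False)                 # N x N sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    V = eigvecs[:, -n_factors:]                   # top-k principal directions
    X_hat = X @ V @ V.T                           # rank-k reconstruction
    return np.linalg.norm(X - X_hat, axis=1)      # per-day residual norm

# Days whose residual norm exceeds, say, the 99th percentile can be flagged
# as candidate stress events.
```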
Daily probability changes in Kalshi macro prediction markets forecast cryptocurrency realized volatility through two distinct channels. The monetary policy channel, measured by Fed rate repricing on KXFED contracts, predicts Bitcoin volatility in sample with t = 3.63 and p < 0.001 but exhibits regime dependence tied to the 2024-2025 rate-cutting cycle. The recession risk signal from KXRECSSNBER proves more stable out of sample, delivering an MSFE ratio of 0.979 with Clark-West p = 0.020. The inflation channel, measured by CPI repricing on KXCPI contracts, predicts altcoin volatility for Ethereum, Solana, Cardano, and Chainlink with t-statistics ranging from -2.1 to -3.4 and out-of-sample gains for Ethereum at MSFE = 0.959 with p = 0.010 and Solana at p = 0.048. Both the Bitcoin--Fed-dovish and Chainlink--CPI specifications survive Benjamini-Hochberg correction at q = 0.05. Orthogonalization and baseline comparisons against Fed Funds futures, Treasury yields, and the Deribit implied volatility index confirm that these signals carry information not embedded in conventional financial instruments. The sample covers ten Kalshi event series and six cryptocurrency assets over January 2023 to March 2026.
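For readers unfamiliar with the out-of-sample metric quoted above, the following sketch implements the standard Clark-West adjusted-MSFE comparison in plain NumPy/SciPy; it is a generic implementation of the published test, not the authors' code.

```python
import numpy as np
from scipy import stats

def clark_west(y, f_restricted, f_unrestricted):
    """Clark-West (2007) test that a nesting model improves on a nested
    benchmark out of sample. Returns the MSFE ratio and one-sided p-value."""
    e_r = y - f_restricted            # benchmark forecast errors
    e_u = y - f_unrestricted          # augmented-model forecast errors
    # Adjusted loss differential: removes the noise penalty the larger
    # model pays for estimating extra parameters.
    f = e_r**2 - e_u**2 + (f_restricted - f_unrestricted)**2
    t = np.sqrt(len(f)) * f.mean() / f.std(ddof=1)
    p = 1.0 - stats.norm.cdf(t)       # one-sided: larger model is better
    return np.mean(e_u**2) / np.mean(e_r**2), p
```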
This paper presents a method for forecasting limit order book durations using a self-exciting flexible residual point process. High-frequency events in modern exchanges exhibit heavy-tailed interarrival times, posing a significant challenge for accurate prediction. The proposed approach incorporates the empirical distributional features of interarrival times while preserving the self-exciting and decay structure. This work also examines the stochastic stability of the process, which can be interpreted as a general state-space Markov chain. Under suitable conditions, the process is irreducible, aperiodic, positive Harris recurrent, and has a stationary distribution. An empirical study demonstrates that the model achieves strong predictive performance compared with several alternative approaches when forecasting durations in ultra-high-frequency trading data.
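A minimal sketch of the self-exciting ingredient, assuming an exponential decay kernel; the paper's flexible residual distribution for heavy-tailed interarrival times is not reproduced here.

```python
import numpy as np

def hawkes_intensity(event_times: np.ndarray, t: float,
                     mu: float = 0.5, alpha: float = 0.8, beta: float = 1.2) -> float:
    """Conditional intensity of a Hawkes process with exponential kernel:
    lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i)).
    Each past event excites the intensity; stationarity needs alpha/beta < 1."""
    past = event_times[event_times < t]
    return mu + alpha * np.exp(-beta * (t - past)).sum()
```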
We address the problem of executing large client orders in continuous double-auction markets under time and liquidity constraints. We propose a model predictive control (MPC) framework that balances three competing objectives: order completion, market impact, and opportunity cost. Our algorithm is guided by a trading schedule (such as time-weighted average price or volume-weighted average price) but allows for deviations to reduce the expected execution cost, with due regard to risk. Our MPC algorithm executes the order progressively, and at each decision step it solves a fast quadratic program that trades off expected transaction cost against schedule deviation, while incorporating a residual cost term derived from a simple base policy. Approximate schedule adherence is maintained through explicit bounds, while variance constraints on deviation provide direct risk control. The resulting system is modular, data-driven, and suitable for deployment in production trading infrastructure. Using six months of NASDAQ 'level 3' data and simulated orders, we show that our MPC approach reduces schedule shortfall by approximately 40-50% relative to spread-crossing benchmarks and achieves significant reductions in slippage. Moreover, augmenting the base policy with predictive price information further enhances performance, highlighting the framework's flexibility for integration with forecasting components.
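A hedged sketch of what one MPC decision step might look like as a quadratic program; `H`, `q`, `s`, `kappa`, `lam`, and `band` are illustrative symbols, not the paper's notation, and the residual base-policy cost term is omitted.

```python
import cvxpy as cp
import numpy as np

# Hypothetical inputs for one decision step: H remaining periods, q shares
# left to execute, s[i] residual schedule targets (e.g. TWAP).
H, q = 10, 50_000.0
s = np.full(H, q / H)
kappa, lam, band = 1e-6, 1e-7, 0.2

x = cp.Variable(H, nonneg=True)        # shares to trade in each period
cost = kappa * cp.sum_squares(x)       # quadratic transaction-cost proxy
dev = lam * cp.sum_squares(x - s)      # penalty for leaving the schedule
constraints = [
    cp.sum(x) == q,                    # complete the parent order
    cp.abs(x - s) <= band * s,         # explicit schedule-adherence bounds
]
cp.Problem(cp.Minimize(cost + dev), constraints).solve()
# Only x.value[0] is executed; the QP is re-solved at the next decision step
# with updated market state, which is what makes this an MPC loop.
```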
KAN-PCA is an autoencoder that uses a Kolmogorov-Arnold network (KAN) as encoder and a linear map as decoder. It generalizes classical PCA by replacing linear projections with learned B-spline functions on each edge. The motivation is to capture more variance than classical PCA, which becomes inefficient during market crises when the linear assumption breaks down and correlations between assets change dramatically. We prove that if the spline activations are forced to be linear, KAN-PCA yields exactly the same results as classical PCA, establishing PCA as a special case. Experiments on 20 S&P 500 stocks (2015-2024) show that KAN-PCA achieves a reconstruction R^2 of 66.57%, compared to 62.99% for classical PCA with the same 3 factors, while matching PCA out-of-sample after correcting for data leakage in the training procedure.
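A speculative PyTorch sketch of the architecture, with Gaussian radial basis functions standing in for the paper's B-splines; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class KANPCA(nn.Module):
    """KAN-style autoencoder sketch: each encoder edge applies a learned
    univariate function (a linear term plus a small radial-basis expansion
    standing in for B-splines); the decoder is a plain linear map. With the
    basis weights forced to zero, the encoder reduces to a linear projection
    and the model collapses to ordinary PCA, mirroring the special case."""
    def __init__(self, n_assets: int = 20, n_factors: int = 3, n_basis: int = 8):
        super().__init__()
        self.register_buffer("centers", torch.linspace(-3, 3, n_basis))
        self.lin = nn.Parameter(torch.randn(n_factors, n_assets) * 0.1)
        self.spline = nn.Parameter(torch.zeros(n_factors, n_assets, n_basis))
        self.decoder = nn.Linear(n_factors, n_assets, bias=False)

    def forward(self, x):                                       # x: (B, assets)
        phi = torch.exp(-(x.unsqueeze(-1) - self.centers) ** 2)  # (B, A, K)
        z = x @ self.lin.T + torch.einsum("bak,fak->bf", phi, self.spline)
        return self.decoder(z), z      # reconstruction and nonlinear factors

# Training would minimize MSE between the reconstruction and x.
```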
We introduce a practical, interactive simulator of the limit order book for large-tick assets, designed to produce realistic execution, costs, and P&L. The book state is projected onto a tractable representation based on spread and volume imbalance, enabling robust estimation from market data. Event timing is calibrated to reproduce the fine-scale temporal structure of real markets, revealing a pronounced mode at exchange round-trip latency consistent with simultaneous reactions and latency races among participants. We further incorporate a feedback mechanism that accumulates signed trade flow through a power-law decay kernel, reproducing both concave market impact during execution and partial post-trade reversion. Across several stocks and strategy case studies, the simulator yields realistic behavior where profitability becomes highly sensitive to execution parameters. We present the approach as a practical recipe: project, estimate, validate, adapt, for building realistic limit order book simulations.
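A minimal sketch of the feedback mechanism, assuming a discrete-time propagator with a power-law kernel; parameter values are illustrative.

```python
import numpy as np

def propagator_impact(signed_flow: np.ndarray, gamma: float = 0.4,
                      scale: float = 1e-4) -> np.ndarray:
    """Price impact from accumulated signed trade flow under a power-law
    decay kernel G(tau) ~ (1 + tau)^(-gamma): impact is concave while
    trading and partially reverts once the flow stops."""
    T = len(signed_flow)
    G = (1.0 + np.arange(T)) ** -gamma
    # impact[t] = scale * sum_{s <= t} G(t - s) * flow[s]  (discrete convolution)
    return scale * np.convolve(signed_flow, G)[:T]
```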
We present FinRL-X, a modular and deployment-consistent trading architecture that unifies data processing, strategy construction, backtesting, and broker execution under a weight-centric interface. While existing open-source platforms are often backtesting- or model-centric, they rarely provide system-level consistency between research evaluation and live deployment. FinRL-X addresses this gap through a composable strategy pipeline that integrates stock selection, portfolio allocation, timing, and portfolio-level risk overlays within a unified protocol. The framework supports both rule-based and AI-driven components, including reinforcement learning allocators and LLM-based sentiment signals, without altering downstream execution semantics. FinRL-X provides an extensible foundation for reproducible, end-to-end quantitative trading research and deployment. The official FinRL-X implementation is available at https://github.com/AI4Finance-Foundation/FinRL-Trading.
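One plausible reading of a "weight-centric interface", sketched as a Python Protocol; this is a hypothetical illustration, not FinRL-X's actual API.

```python
from typing import Protocol, Mapping

class WeightStrategy(Protocol):
    """Hypothetical weight-centric contract: every stage (selection,
    allocation, timing, risk overlay) consumes and emits target weights, so
    backtester and live-broker adapters can share one execution path and
    research/deployment behavior stays consistent."""
    def target_weights(self, date: str,
                       features: Mapping[str, float]) -> Mapping[str, float]:
        """Map a date and feature snapshot to {ticker: portfolio weight}."""
        ...
```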
This paper studies whether a lightweight trained aggregator can combine diverse zero-shot large language model judgments into a stronger downstream signal for corporate disclosure classification. Zero-shot LLMs can read disclosures without task-specific fine-tuning, but their predictions often vary across prompts, reasoning styles, and model families. I address this problem with a multi-agent framework in which three zero-shot agents independently read each disclosure and output a sentiment label, a confidence score, and a short rationale. A logistic meta-classifier then aggregates these signals to predict next-day stock return direction. I use a sample of 18,420 U.S. corporate disclosures issued by Nasdaq and S&P 500 firms between 2018 and 2024, matched to next-day stock returns. Results show that the trained aggregator outperforms all single agents, majority vote, confidence-weighted voting, and a FinBERT baseline. Balanced accuracy rises from 0.561 for the best single agent to 0.612 for the trained aggregator, with the largest gains in disclosures combining strong current performance with weak guidance or elevated risk. The results suggest that zero-shot LLM agents capture complementary financial signals and that supervised aggregation can turn cross-agent disagreement into a more useful classification target.
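A minimal sketch of the supervised aggregation step on synthetic agent outputs; the feature construction here (labels, confidences, and their interactions) is an assumption, not the paper's exact design.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic per-disclosure agent outputs: for each of three zero-shot
# agents, a sentiment label in {-1, 0, +1} and a confidence in [0, 1].
rng = np.random.default_rng(0)
labels = rng.choice([-1, 0, 1], size=(1000, 3))
conf = rng.uniform(0.4, 1.0, size=(1000, 3))
X = np.hstack([labels, conf, labels * conf])   # include interaction terms
y = rng.integers(0, 2, size=1000)              # next-day up/down direction

meta = LogisticRegression(max_iter=1000).fit(X, y)
p_up = meta.predict_proba(X)[:, 1]             # aggregated directional signal
```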
We propose a Neural Hidden Markov Model (HMM) with Adaptive Granularity Attention (AGA) for high-frequency order flow modeling. The model addresses the challenge of capturing multi-scale temporal dynamics in financial markets, where fine-grained microstructure signals and coarse-grained liquidity trends coexist. The proposed framework integrates parallel multi-resolution encoders, including a dilated convolutional network for tick-level patterns and a wavelet-LSTM module for low-frequency dynamics. A gating mechanism conditioned on local volatility and transaction intensity adaptively fuses multi-scale representations, while a multi-head attention layer further enhances temporal dependency modeling. Within this architecture, a Neural HMM with conditional normalizing flow emissions is employed to jointly model latent market regimes and complex observation distributions. Empirical results on high-frequency limit order book data demonstrate that the proposed model outperforms fixed-resolution baselines in predicting short-term price movements and liquidity shocks. The adaptive granularity mechanism enables the model to dynamically adjust its focus across time scales, providing improved performance particularly during volatile market conditions.
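A minimal sketch of the adaptive fusion step, assuming the gate is a single linear layer over (local volatility, transaction intensity); names and dimensions are illustrative, and the HMM and flow-emission components are omitted.

```python
import torch
import torch.nn as nn

class AdaptiveGranularityGate(nn.Module):
    """Gate conditioned on local market state that blends a fine-scale
    (tick-level) representation with a coarse-scale (low-frequency) one."""
    def __init__(self, d_model: int = 64):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2, d_model), nn.Sigmoid())

    def forward(self, h_fine, h_coarse, vol, intensity):
        # h_fine, h_coarse: (batch, d_model); vol, intensity: (batch, 1)
        g = self.gate(torch.cat([vol, intensity], dim=-1))   # (batch, d_model)
        return g * h_fine + (1.0 - g) * h_coarse  # convex, state-dependent mix
```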
Rendering fair prices for financial, credit, and insurance products is of ethical and regulatory interest. In many jurisdictions, discriminatory covariates, such as gender and ethnicity, are prohibited from use in pricing such instruments. In this work, we propose a discrimination-insensitive pricing framework, where we require the pricing principle to be insensitive to the (exogenously determined) protected covariates; that is, the sensitivity of the pricing principle to the protected covariate is zero. We formulate and solve the optimisation problem that finds the nearest (in Kullback-Leibler (KL) divergence) "pricing" measure to the real-world probability, such that under this pricing measure the principle is discrimination-insensitive. We call the solution the discrimination-insensitive measure and provide conditions for its existence and uniqueness. When there is more than one protected covariate, the discrimination-insensitive pricing measure might not exist, and we propose a two-step procedure. First, for each protected covariate separately, we find the measure under which the pricing principle becomes insensitive to that covariate. Second, we reconcile these measures through a constrained barycentre model. We provide a closed-form solution to this problem and give conditions for the existence and uniqueness of the constrained barycentre pricing measure. As an intermediary result, we prove the representation, existence, and uniqueness of the KL barycentre of general probability measures, which may be of independent interest. Finally, in a numerical illustration, we compare our discrimination-insensitive premia and the constrained barycentre pricing measure with recently proposed fair premia from the actuarial literature.
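The generic form of such KL projections may help fix ideas: with a single (linearised) constraint $\mathbb{E}_Q[g(X)] = 0$, the minimiser is an exponential tilt of the reference measure. The paper's exact sensitivity constraint and multi-covariate barycentre construction are not reproduced here.

```latex
% KL projection onto a moment constraint: the closest measure to P in
% KL divergence satisfying E_Q[g(X)] = 0 is an exponential tilt of P,
% with lambda* chosen so that the constraint holds.
\min_{Q \ll P} \; \mathrm{KL}(Q \,\|\, P)
\quad \text{s.t.} \quad \mathbb{E}_Q[g(X)] = 0
\qquad \Longrightarrow \qquad
\frac{dQ^*}{dP} = \frac{e^{\lambda^* g(X)}}{\mathbb{E}_P\!\left[e^{\lambda^* g(X)}\right]}.
```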
We study optimal auction design for Maximum Extractable Value (MEV) auction markets on Ethereum. Using a dataset of 2.2 million transactions across three major orderflow providers, we establish three empirical regularities: extracted values follow a log-normal distribution with extreme right-tail concentration, competition intensity varies substantially across MEV types, and the standard Revenue Equivalence Theorem breaks down due to affiliation among searchers' valuations. We model this affiliation through a Gaussian common factor, deriving equilibrium bidding strategies and expected revenues for five auction formats (first-price sealed-bid, second-price sealed-bid, English, Dutch, and all-pay) across a fine grid of bidder counts $n$ and affiliation parameters $\rho$. Our simulations confirm the Milgrom-Weber linkage principle: English and second-price sealed-bid auctions strictly dominate Dutch and first-price sealed-bid formats for any $\rho > 0$, with a linkage gap of 14-28\% at moderate affiliation ($\rho = 0.5$) and up to 30\% for small bidder counts. Applied to observed bribe totals, this gap corresponds to \$10-18 million in foregone revenue over the sample period. We also document a novel non-monotonicity: at large $n$ and high $\rho$, revenue peaks in the interior of the affiliation parameter space and declines thereafter, as near-perfect correlation collapses the order-statistic spread that drives competitive payments.
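A minimal Monte Carlo sketch of the affiliated-values setting for the second-price/English format, where bidding one's value remains an equilibrium; the first-price equilibrium shading under affiliation is harder and is omitted here.

```python
import numpy as np

def second_price_revenue(n: int = 8, rho: float = 0.5,
                         n_sims: int = 100_000, seed: int = 0) -> float:
    """Expected revenue of a second-price/English auction with affiliated
    log-normal private values driven by a Gaussian common factor:
        log v_i = sqrt(rho) * Z + sqrt(1 - rho) * eps_i,  rho in [0, 1).
    Truthful bidding is dominant, so revenue is the second-highest value;
    as rho -> 1 the order-statistic spread collapses, illustrating the
    non-monotonicity discussed above."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n_sims, 1))          # common factor
    eps = rng.standard_normal((n_sims, n))        # idiosyncratic shocks
    v = np.exp(np.sqrt(rho) * Z + np.sqrt(1.0 - rho) * eps)
    return np.sort(v, axis=1)[:, -2].mean()       # mean second order statistic
```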
Whether heterogeneous investor flows transmit private information across stocks or merely reflect coordinated responses to public signals remains an open question in market microstructure. We construct Transfer Entropy (TE) networks from investor-type flows -- foreign, institutional, and individual -- for \numNStocks{} Korean equities over \numNDates{} trading days (January 2020 to February 2025), and evaluate their economic content through interaction information (II), conditional TE, mutual information (MI), Kelly criterion bounds, and Fama-MacBeth regressions. Three findings emerge. First, TE networks are sparse and structurally heterogeneous: foreign investors maintain few but strong links (\numEdgesFor{} edges, mean TE = \numMeanTEFor{}), while individual investors form many but weak links (\numEdgesInd{} edges, mean TE = \numMeanTEInd{}). Second, cross-investor information is redundant rather than synergistic: no investor type directionally dominates another, and MI between signals and returns is zero at the daily horizon. Third, network centrality adds negligible alpha in cross-sectional regressions, with only one of six signal-centrality interactions reaching marginal significance. These results indicate that the observed propagation structure captures shared information processing rather than private signal cascades, consistent with daily-frequency market efficiency.
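A minimal sketch of a binned lag-1 transfer entropy estimator, assuming quantile discretization; the paper's conditional TE, interaction information, and significance filtering are omitted.

```python
import numpy as np

def transfer_entropy(x: np.ndarray, y: np.ndarray, bins: int = 3) -> float:
    """Binned transfer entropy TE(X -> Y) at lag 1, in nats:
    sum over (y', y, x) of p(y', y, x) * log[ p(y'|y, x) / p(y'|y) ]."""
    def disc(a):
        edges = np.quantile(a, np.linspace(0, 1, bins + 1))[1:-1]
        return np.digitize(a, edges)             # symbols in {0,...,bins-1}

    xd, yd = disc(x), disc(y)
    p = np.zeros((bins, bins, bins))             # axes: y_next, y_now, x_now
    for a, b, c in zip(yd[1:], yd[:-1], xd[:-1]):
        p[a, b, c] += 1
    p /= p.sum()
    p_yx = p.sum(axis=0, keepdims=True)          # p(y, x)
    p_yy = p.sum(axis=2, keepdims=True)          # p(y', y)
    p_y = p.sum(axis=(0, 2), keepdims=True)      # p(y)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = (p * p_y) / (p_yx * p_yy)        # = p(y'|y,x) / p(y'|y)
        return float(np.nansum(p * np.log(ratio)))
```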
Trend-following strategies underpin many systematic trading approaches yet struggle under nonstationary and nonlinear market regimes. We propose an LSTM-based framework to forecast next-day trend differences ($\Delta_t$) for the top 30 S\&P 500 equities, validated across market cycles (2005--2025). Key contributions include: (i) a formal proof of bias-variance reduction via differencing, (ii) exhaustive empirical benchmarks against OLS, Ridge, and Lasso, and (iii) portfolio simulations confirming economic gains, in terms of overall PnL, over OLS, Ridge, Lasso, and LightGBM regressor baselines.
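A minimal PyTorch sketch of the forecasting component, assuming a single-layer LSTM over a lookback window of differenced trend values; all sizes are illustrative.

```python
import torch
import torch.nn as nn

class TrendDiffLSTM(nn.Module):
    """LSTM regressor on differenced trend values: predicting the next-day
    difference rather than the level is the target transformation that the
    bias-variance argument above concerns."""
    def __init__(self, n_features: int = 1, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):             # x: (batch, lookback, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # forecast of the next-day difference
```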
The first 100 days of Donald Trump's second presidential term (January 20th - April 30th, 2025) featured policy actions with potential market repercussions, constituting a well-suited case study of a concentrated policy scenario. Here, we provide a first look at this period, rooted in information theory, by analyzing major stock indices across the Americas, Europe, Asia, and Oceania. Our approach jointly examines dispersion (standard deviation) and information complexity (entropy), and also employs a sliding-window cumulative entropy to localize extreme events. We find a notable decoupling between the first two measures, indicating that entropy is not merely a proxy for amplitude but reflects the diversity of populated outcomes. As such, the two measures allow us to capture both market volatility and narrative constraints, signaling large and coherent moves driven by policy changes. In turn, the cumulative entropy is found to increase notably during regional episodes with high information density, providing effective signatures of such events. We argue that these results indicate short-term globally coupled, yet regionally modulated, market impacts with a clear connection to the introduced policies. Overall, the presented entropic framework emerges as an efficient complement to standard methods for characterizing markets under turbulent conditions, with potential to enhance forecasting strategies such as stochastic modeling.
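A minimal sketch of the sliding-window entropy measure, assuming histogram (Shannon) entropy of binned returns; window and bin counts are illustrative.

```python
import numpy as np

def sliding_entropy(returns: np.ndarray, window: int = 20, bins: int = 10) -> np.ndarray:
    """Shannon entropy of binned returns over a sliding window, in nats.
    High entropy = many outcome states populated; a drop signals coherent,
    narrative-constrained moves even when dispersion stays large."""
    H = np.full(len(returns), np.nan)
    for t in range(window, len(returns) + 1):
        counts, _ = np.histogram(returns[t - window:t], bins=bins)
        p = counts[counts > 0] / window
        H[t - 1] = -(p * np.log(p)).sum()
    return H

# A cumulative version (e.g. np.nancumsum of H) can then be inspected for
# the sharp increases that localize episodes of high information density.
```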
Forecasting crude oil prices remains challenging because market-relevant information is embedded in large volumes of unstructured news and is not fully captured by traditional polarity-based sentiment measures. This paper examines whether multi-dimensional sentiment signals extracted by large language models improve the prediction of weekly WTI crude oil futures returns. Using energy-sector news articles from 2020 to 2025, we construct five sentiment dimensions covering relevance, polarity, intensity, uncertainty, and forwardness based on GPT-4o, Llama 3.2-3b, and two benchmark models, FinBERT and AlphaVantage. We aggregate article-level signals to the weekly level and evaluate their predictive performance in a classification framework. The best results are achieved by combining GPT-4o and FinBERT, suggesting that LLM-based and conventional financial sentiment models provide complementary predictive information. SHAP analysis further shows that intensity- and uncertainty-related features are among the most important predictors, indicating that the predictive value of news sentiment extends beyond simple polarity. Overall, the results suggest that multi-dimensional LLM-based sentiment measures can improve commodity return forecasting and support energy-market risk monitoring.
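A minimal sketch of the weekly aggregation and classification stage; column names and the choice of classifier are assumptions, not the paper's design.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

DIMS = ["relevance", "polarity", "intensity", "uncertainty", "forwardness"]

def weekly_features(articles: pd.DataFrame) -> pd.DataFrame:
    """Aggregate article-level sentiment scores to weekly means.
    `articles` is assumed to have a datetime 'date' column plus one column
    per LLM-scored sentiment dimension."""
    return articles.set_index("date")[DIMS].resample("W").mean()

# Hypothetical usage, given a weekly up/down label for WTI futures returns:
# weekly = weekly_features(articles).join(direction_labels).dropna()
# clf = GradientBoostingClassifier().fit(weekly[DIMS], weekly["up_next_week"])
```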
Generating synthetic financial time series that preserve statistical properties of real market data is essential for stress testing, risk model validation, and scenario design. Existing approaches, from parametric models to deep generative networks, struggle to simultaneously reproduce heavy-tailed distributions, negligible linear autocorrelation, and persistent volatility clustering. We propose a hybrid hidden Markov framework that discretizes continuous excess growth rates into Laplace quantile-defined market states and augments regime switching with a Poisson-driven jump-duration mechanism to enforce realistic tail-state dwell times. Parameters are estimated by direct transition counting, bypassing the Baum-Welch EM algorithm. Synthetic data quality is evaluated using Kolmogorov-Smirnov and Anderson-Darling pass rates for distributional fidelity, and ACF mean absolute error for temporal structure. Applied to ten years of SPY data across 1,000 simulated paths, the framework achieves KS and AD pass rates exceeding 97% and 91% in-sample and 94% out-of-sample (calendar year 2025), partially reproducing the ARCH effect that standard regime-switching models miss. No single model dominates all quality dimensions: GARCH(1,1) reproduces volatility clustering more accurately but fails distributional tests (5.5% KS pass rate), while the standard HMM without jumps achieves higher distributional fidelity but cannot generate persistent high-volatility regimes. The proposed framework offers the best joint quality profile across distributional, temporal, and tail-coverage metrics. A Single-Index Model extension propagates the SPY factor path to a 424-asset universe, enabling scalable correlated synthetic path generation while preserving cross-sectional correlation structure.
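A minimal sketch of the state construction and transition-counting step; the Poisson jump-duration mechanism and the simulation stage are omitted.

```python
import numpy as np
from scipy import stats

def fit_states_and_transitions(growth: np.ndarray, n_states: int = 5):
    """Discretize excess growth rates into market states via quantiles of a
    fitted Laplace distribution, then estimate the transition matrix by
    direct counting: with observable states, no Baum-Welch EM is needed."""
    loc, scale = stats.laplace.fit(growth)
    edges = stats.laplace.ppf(np.linspace(0, 1, n_states + 1)[1:-1], loc, scale)
    states = np.digitize(growth, edges)            # 0 .. n_states-1
    P = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        P[a, b] += 1
    # Row-normalize counts (guard against states never visited).
    P /= np.maximum(P.sum(axis=1, keepdims=True), 1.0)
    return states, P
```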
A common practice in empirical finance is to construct calendar-aligned panels that implicitly treat all instruments as having existed for the full observation period. When securities with different listing histories are combined without explicit coverage constraints, price histories can be inadvertently extended before valid trading ever began. This paper formalizes this problem and proposes a coverage-aware structuring framework built around instrument-level observation windows encoded through structured metadata and an availability matrix. Applied to end-of-day data from the Dhaka Stock Exchange spanning October 2012 to January 2026 and covering 486 instruments, the framework reveals substantial distortions from naive temporal alignment. ARIMA-based experiments establish the mechanism through which padded observations corrupt return dynamics, and volatility analysis across 53 instruments shows that forward-filling alone suppresses return volatility by roughly 20% on average, with GARCH unconditional variance distortions exceeding 26% in over 90% of instruments - a lower bound, as backward extension to the panel start produces 36.6% suppression and causes GARCH non-convergence in 41% of instruments. The distortion affects any method requiring calendar alignment of heterogeneous histories, including dynamic time warping, covariance-based portfolio construction, factor model regression, and temporal foundation model fine-tuning. Although demonstrated on financial data, the framework applies to any panel combining entities with heterogeneous entry dates, including sensor networks, clinical cohorts, and country-level economic panels. Listing coverage is not a minor preprocessing detail but a first-order variable in panel construction.
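A minimal sketch of the availability matrix, assuming per-instrument first/last observation dates are known; column names are illustrative.

```python
import pandas as pd

def availability_matrix(windows: pd.DataFrame,
                        calendar: pd.DatetimeIndex) -> pd.DataFrame:
    """Boolean availability matrix A[date, instrument]: True only inside an
    instrument's observed listing window. `windows` is assumed to have
    columns ['instrument', 'first_obs', 'last_obs'] with datetime bounds."""
    A = pd.DataFrame(False, index=calendar, columns=windows["instrument"])
    for _, r in windows.iterrows():
        A.loc[r["first_obs"]:r["last_obs"], r["instrument"]] = True
    return A

# Downstream panel operations are then masked, e.g. prices.where(A), rather
# than forward/backward-filled outside each instrument's valid window.
```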
Predicting stock prices presents challenges in financial forecasting. While traditional approaches such as ARIMA and RNNs are prevalent, recent developments in Large Language Models (LLMs) offer alternative methodologies. This paper introduces an approach that integrates LLMs with daily financial news for stock price prediction. To address the challenge of processing news data and identifying relevant content, we utilize stock name embeddings within attention mechanisms. Specifically, we encode news articles using a pre-trained LLM and implement three attention-based pooling techniques -- self-attentive, cross-attentive, and position-aware self-attentive pooling -- to filter news based on stock relevance. The filtered news embeddings, combined with historical stock prices, serve as inputs to the prediction model. Unlike prior studies that focus on individual stocks, our method trains a single generalized model applicable across multiple stocks. Experimental results demonstrate a 7.11% reduction in Mean Absolute Error (MAE) compared to the baseline, indicating the utility of stock name embeddings for news filtering and price forecasting within a generalized framework.
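A minimal sketch of the cross-attentive pooling variant, assuming the stock-name embedding acts as the attention query over the day's news embeddings; dimensions are illustrative.

```python
import torch
import torch.nn as nn

class StockNewsPooling(nn.Module):
    """Cross-attentive pooling: a stock-name embedding queries the day's
    news-article embeddings, so articles relevant to that stock dominate
    the pooled representation fed to the price-prediction model."""
    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, stock_emb, news_emb):
        # stock_emb: (batch, 1, d) query; news_emb: (batch, n_articles, d)
        pooled, weights = self.attn(stock_emb, news_emb, news_emb)
        return pooled.squeeze(1), weights   # stock-conditioned news summary
```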
Extreme values and the tail behavior of probability distributions are essential for quantifying and mitigating risk in complex systems of all kinds. In multivariate settings, accounting for correlations is crucial. Although extreme value analysis for infinite correlated systems remains an open challenge, we propose a practical framework for handling a large but finite number of correlated time series. We develop our approach for finance as a concrete example but emphasize its generality. We study the extremal behavior of high-frequency stock returns after rotating them into the eigenbasis of the correlation matrix. This separates and extracts various collective effects, including information on the correlated market as a whole and on correlated sectoral behavior from idiosyncratic features, while allowing us to use univariate tools of extreme value analysis. This holds even for high-frequency data where discretization effects normally complicate analysis. We employ a peaks-over-threshold approach and thereby fully avoid the analysis of block maxima. We estimate the tail shape of the rotated returns while explicitly accounting for nonstationarity, a key feature in finance and many other complex systems. Our framework facilitates tail risk estimation relative to larger trends and intraday seasonalities at both market and sectoral levels.
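A minimal sketch of the rotation and peaks-over-threshold step, ignoring the paper's explicit nonstationarity adjustments; threshold choice is illustrative.

```python
import numpy as np
from scipy.stats import genpareto

def rotated_tail_shapes(returns: np.ndarray, threshold_q: float = 0.99) -> np.ndarray:
    """Rotate returns into the eigenbasis of the correlation matrix, then
    fit a generalized Pareto distribution to threshold exceedances of each
    rotated series (peaks-over-threshold; no block maxima required).
    returns: (T, N) matrix of high-frequency returns."""
    C = np.corrcoef(returns, rowvar=False)
    _, V = np.linalg.eigh(C)
    rotated = returns @ V                      # columns: eigen-directions,
    shapes = []                                # market/sector/idiosyncratic
    for col in rotated.T:
        u = np.quantile(col, threshold_q)
        exceedances = col[col > u] - u
        xi, _, _ = genpareto.fit(exceedances, floc=0.0)
        shapes.append(xi)                      # univariate tail-shape estimate
    return np.array(shapes)
```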
This paper develops a robust parametric framework for jump detection in discretely observed CKLS-type jump-diffusion processes with high-frequency asymptotics, based on the minimum density power divergence estimator (MDPDE). The methodology exploits the intrinsic asymptotic scale separation between diffusion increments, which decay at rate $\sqrt{\Delta_n}$, and jump increments, which remain of non-vanishing stochastic magnitude. Using robust MDPDE-based estimators of the drift and diffusion coefficients, we construct standardized residuals whose extremal behavior provides a principled basis for statistical discrimination between continuous and discontinuous components. We establish that, over diffusion intervals, the maximum of the normalized residuals converges to the Gumbel extreme-value distribution, yielding an explicit and asymptotically valid detection threshold. Building on this result, we prove classification consistency of the proposed robust detection procedure: the probability of correctly identifying all jump and diffusion increments converges to one under proper asymptotics. The MDPDE-based normalization attenuates the influence of atypical increments and stabilizes the detection boundary in the presence of discontinuities. Simulation results confirm that robustness improves finite-sample stability and reduces spurious detections without compromising asymptotic validity. The proposed methodology provides a theoretically rigorous and practically resilient approach to jump identification in high-frequency stochastic systems.
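A minimal sketch of the Gumbel-based threshold, using Lee-Mykland-style centering and scaling constants as a stand-in and assuming the residuals have already been standardized by robust drift/diffusion estimates (the paper's MDPDE step).

```python
import numpy as np

def gumbel_jump_flags(residuals: np.ndarray, alpha: float = 0.01) -> np.ndarray:
    """Flag standardized residuals as jumps when they exceed an extreme-value
    threshold: under a pure diffusion, the maximum standardized increment,
    after Lee-Mykland (2008)-type centering and scaling, converges to a
    Gumbel law, so exceedances of the (1 - alpha) Gumbel quantile are rare."""
    n = len(residuals)
    c = np.sqrt(2.0 / np.pi)
    Cn = (np.sqrt(2 * np.log(n)) / c
          - (np.log(np.pi) + np.log(np.log(n))) / (2 * c * np.sqrt(2 * np.log(n))))
    Sn = 1.0 / (c * np.sqrt(2 * np.log(n)))
    q = -np.log(-np.log(1 - alpha))           # Gumbel (1 - alpha) quantile
    return np.abs(residuals) > Cn + Sn * q    # True where a jump is declared
```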