Machine Learning vs. Randomness: Challenges in Predicting Binary Options Movements

Gabriel M. Arantes, Richard F. Pinto, Bruno L. Dalmazo, Eduardo N. Borges, Giancarlo Lucca, Viviane L. D. de Mattos, Fabian C. Cardoso, Rafael A. Berri

TL;DR

This study asks whether machine learning can predict binary options movements in highly stochastic markets. The authors evaluate a broad set of algorithms (Random Forest, Logistic Regression, Gradient Boosting, kNN, MLP, and LSTM) with feature selection and Hyperband optimization on minute-level EUR/USD data from 2021–2023, and compare each against a ZeroR baseline to test for a learnable signal. None of the models surpasses the baseline: several architectures converge to majority-class predictions, and extended training does not yield transferable patterns. The findings challenge the feasibility of ML-driven forecasting in binary options and suggest pivoting to richer data sources or alternative modeling paradigms for noisy financial environments. For practitioners, the work cautions against overestimating ML capabilities in highly random, speculative markets and highlights the need for signals more informative than traditional indicators such as SMA and RSI.

Abstract

Binary options trading is often marketed as a field where predictive models can generate consistent profits. However, the inherent randomness of binary options makes price movements highly unpredictable, posing significant challenges for any forecasting approach. This study demonstrates that machine learning algorithms struggle to outperform a simple baseline in predicting binary options movements. Using a dataset of the EUR/USD currency pair from 2021 to 2023, we tested multiple models, including Random Forest, Logistic Regression, Gradient Boosting, and k-Nearest Neighbors (kNN), both before and after hyperparameter optimization. Furthermore, several neural network architectures, including Multi-Layer Perceptrons (MLP) and a Long Short-Term Memory (LSTM) network, were evaluated under different training conditions. Despite these exhaustive efforts, none of the models surpassed the ZeroR baseline accuracy, highlighting the inherent randomness of binary options. These findings reinforce the notion that binary options lack predictable patterns, making them unsuitable for machine learning-based forecasting.
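To make the ZeroR comparison concrete, here is a minimal sketch of how such a baseline is typically computed: it ignores every feature and always predicts the most frequent class seen in training. The helper names (`zero_r_fit`, `accuracy`) and the toy labels are assumptions for illustration only; they are not the authors' code or data.

```python
from collections import Counter

def zero_r_fit(train_labels):
    """Return the most frequent training label (ZeroR ignores all features)."""
    return Counter(train_labels).most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels, NOT the paper's data: 1 = price moved up, 0 = price moved down.
train_labels = [1] * 52 + [0] * 48
test_labels = [1] * 11 + [0] * 9

majority_class = zero_r_fit(train_labels)
baseline_predictions = [majority_class] * len(test_labels)
baseline_accuracy = accuracy(test_labels, baseline_predictions)

print(f"ZeroR predicts class {majority_class}; test accuracy = {baseline_accuracy:.2f}")
```

Any trained model would need a test accuracy above `baseline_accuracy` to show that it learned something beyond the class imbalance.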

Paper Structure

This paper contains 20 sections, 2 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Training and validation accuracy of the MLP over epochs. The divergence between the rising training accuracy (blue) and the decreasing validation accuracy (orange) is a clear sign of overfitting. The final test accuracy on the comprehensive unseen dataset (red dot) matched the ZeroR baseline exactly, confirming that the patterns memorized during training failed to generalize.
  • Figure 2: Confusion matrices from models prior to hyperparameter optimization, illustrating three distinct predictive behaviors: (a) the baseline majority-class prediction, (b) a minimal deviation from the baseline, and (c) an unsuccessful attempt to classify both classes.
  • Figure 3: Final accuracy comparison of all primary models against the ZeroR baseline.
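The majority-class behavior described for Figure 2(a) can be recognized directly from a confusion matrix: every prediction lands in a single column. The sketch below is one possible check, assuming binary labels encoded as 0/1; the function names and the toy label vectors are hypothetical and do not come from the paper.

```python
def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels encoded as 0/1."""
    matrix = [[0, 0], [0, 0]]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix

def collapsed_to_one_class(matrix):
    """True when all predictions fall in a single column (one predicted class)."""
    predicted_0 = matrix[0][0] + matrix[1][0]
    predicted_1 = matrix[0][1] + matrix[1][1]
    return predicted_0 == 0 or predicted_1 == 0

# Toy labels for illustration only.
y_true = [1, 0, 1, 1, 0, 1]
always_up = [1] * len(y_true)   # behaves like a majority-class predictor
mixed = [1, 0, 0, 1, 1, 1]      # attempts to classify both classes

cm_always_up = confusion_matrix(y_true, always_up)
cm_mixed = confusion_matrix(y_true, mixed)

print(cm_always_up, collapsed_to_one_class(cm_always_up))
print(cm_mixed, collapsed_to_one_class(cm_mixed))
```

A model flagged by `collapsed_to_one_class` has the same accuracy as ZeroR whenever the class it predicts is the majority class of the evaluation set.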