AROhI: An Interactive Tool for Estimating ROI of Data Analytics
Noopur Zambare, Jacob Idoko, Jagrit Acharya, Gouri Ginde
TL;DR
AROhI addresses the need for ROI-aware evaluation of data analytics by introducing a no-code interactive dashboard that combines conventional ML approaches with advanced methods such as Active Learning and fine-tuned BERT to estimate ROI. The tool enables decision-makers to adjust cost and benefit factors and visualize trade-offs between ML performance (e.g., F1) and ROI, demonstrated on a requirements-dependency extraction use case and deployed on AWS. Key contributions include a concrete ROI model linking benefits (true positives and penalties) to costs (data prep, labeling, and resources) and a workflow for comparing supervised and semi-supervised learners under cost constraints. The work advances practical, ROI-driven decision-making in software analytics and outlines directions for extending to broader datasets, unsupervised learning, and LLM-based ROI calculations.
Abstract
The cost of adopting new technology is rarely analyzed or discussed, even though it is vital for many software companies worldwide. It is therefore crucial to consider Return on Investment (ROI) when performing data analytics. The question "How much analytics is needed?" is hard to answer. ROI could guide decision support on the What?, How?, and How Much? of analytics for a given problem. This work details a comprehensive tool that provides conventional and advanced ML approaches, demonstrated using requirements dependency extraction and its ROI analysis as a use case. The tool uses advanced ML techniques such as Active Learning, Transfer Learning, and an early large language model, BERT (Bidirectional Encoder Representations from Transformers), as components for automating dependency extraction. Its outcomes demonstrate a mechanism for computing the ROI of ML algorithms, presenting a clear picture of the trade-offs between the costs and benefits of a technology investment.
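As a rough illustration of the kind of ROI computation summarized above, the following Python sketch uses the standard definition ROI = (benefit - cost) / cost. It is not the tool's actual model: the names `CostFactors` and `estimate_roi`, the per-unit value and penalty parameters, and the choice to charge the penalty per false negative are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CostFactors:
    """Hypothetical cost inputs; the names are illustrative, not taken from AROhI."""
    data_prep: float            # one-off data preparation cost
    labeling_per_sample: float  # cost of labeling a single training sample
    resources: float            # compute and personnel cost


def estimate_roi(true_positives: int, false_negatives: int,
                 value_per_tp: float, penalty_per_fn: float,
                 n_labeled: int, costs: CostFactors) -> float:
    """Return ROI = (benefit - cost) / cost under the assumptions above."""
    # Assumed benefit: value earned per correct detection minus a penalty per miss.
    benefit = true_positives * value_per_tp - false_negatives * penalty_per_fn
    # Assumed cost: preparation + labeling effort + resources.
    cost = (costs.data_prep
            + costs.labeling_per_sample * n_labeled
            + costs.resources)
    if cost <= 0:
        raise ValueError("total cost must be positive to compute ROI")
    return (benefit - cost) / cost


# Example with made-up numbers: benefit = 700, cost = 400.
example_costs = CostFactors(data_prep=100.0, labeling_per_sample=1.0, resources=100.0)
example_roi = estimate_roi(80, 20, 10.0, 5.0, 200, example_costs)
```

Under these assumptions, labeling more samples raises the cost linearly, so a classifier with a higher F1 score can still yield a lower ROI if the extra true positives do not offset the added labeling effort.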
