TALICS$^3$: Tape Library Cloud Storage System Simulator
Suayb S. Arslan, James Peng, Turguy Goker
TL;DR
This work addresses the challenge of accurately modeling large-scale tape-library archives in cloud environments in order to estimate KPIs such as data access latency and reliability. It introduces TALICS$^3$, a discrete-event simulator that models two interdependent queues (DR and D) for data requests, configurable library geometry, object sizes, and exchange cycles, as well as redundancy through replication or erasure coding and the protocols that govern retrieval; it supports both single-enterprise and multi-library (RAIL-like) configurations. The authors derive a Poisson arrival rate that links workload to system parameters, $\lambda = \frac{NoC \times C_t \times \Phi_f \times \textrm{AOTR} \times k}{n \times \mu_o \times T}$, and offer a rough queuing-theory framework ($M/G/c$) for latency analysis, complemented by numerical results showing that scale-out libraries reduce latency and improve stability relative to a scale-up enterprise configuration. Overall, TALICS$^3$ provides a practical design tool for reliability engineers to explore trade-offs in cold archival backends and to guide deployment decisions under cost and performance constraints.
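As a rough illustration of the arrival-rate expression above, the following Python sketch evaluates $\lambda$ and draws request arrival times from the corresponding Poisson process. The function names, the readings of the symbols given in the comments, and all numeric values are assumptions made for illustration only; this summary does not define the symbols, and the sketch is not the simulator's actual code.

```python
import random


def poisson_arrival_rate(noc, c_t, phi_f, aotr, k, n, mu_o, t):
    """Evaluate lambda = (NoC * C_t * Phi_f * AOTR * k) / (n * mu_o * T).

    Assumed readings (not confirmed by the summary): noc = number of
    cartridges, c_t = cartridge capacity, phi_f = fill fraction,
    aotr = object touch rate over the period, k/n = code rate,
    mu_o = mean object size, t = length of the period.
    """
    denominator = n * mu_o * t
    if denominator <= 0:
        raise ValueError("n, mu_o and t must all be positive")
    return (noc * c_t * phi_f * aotr * k) / denominator


def sample_arrival_times(lam, horizon, seed=0):
    """Draw Poisson-process arrival times in [0, horizon).

    Uses the standard fact that inter-arrival gaps of a Poisson process
    with rate lam are exponentially distributed with mean 1 / lam.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    rng = random.Random(seed)
    times = []
    now = rng.expovariate(lam)
    while now < horizon:
        times.append(now)
        now += rng.expovariate(lam)
    return times


if __name__ == "__main__":
    # Hypothetical workload: all values below are made up for the example.
    lam = poisson_arrival_rate(
        noc=1000, c_t=18e12, phi_f=0.8, aotr=0.1,
        k=4, n=6, mu_o=1e9, t=365 * 24 * 3600,
    )
    arrivals = sample_arrival_times(lam, horizon=3600.0, seed=42)
    print(f"lambda = {lam:.5f} requests per unit of T")
    print(f"{len(arrivals)} arrivals sampled in the first 3600 time units")
```

Under these assumptions, the rate scales linearly with the amount of stored data and the touch rate, and inversely with the mean object size, so a workload of many small objects produces a much heavier request stream than the same volume stored as large objects.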
Abstract
High-performance computing data is growing rapidly toward the exabyte scale, where tape libraries are the main platform for long-term durable data storage besides high-cost DNA. Tape libraries are extremely hard to model, but accurate modeling is critical for system administrators to obtain valid performance estimates for their designs. This research introduces a discrete-event tape simulation platform that realistically models tape library behavior in a networked cloud environment by incorporating real-world phenomena and effects. The platform addresses several challenges, including precise estimation of data access latency, rates of robot exchange, data collocation, deduplication/compression ratio, and attainment of durability goals through replication or erasure coding. Using the proposed simulator, one can compare the single enterprise configuration with multiple commodity library configurations. This makes the simulator a valuable tool for system administrators and reliability engineers, enabling them to acquire practical and dependable performance estimates for their enduring, cost-efficient cold data storage architecture designs.
