A Scalable System for Visual Analysis of Ocean Data

Toshit Jain; Upkar Singh; Varun Singh; Vijay Kumar Boda; Ingrid Hotz; Sathish S. Vadhiyar; P. N. Vinayachandran; Vijay Natarajan

TL;DR

This paper introduces pyParaOcean, a scalable, interactive visualization system for 3D time-varying ocean data, built as ParaView plugins to leverage parallelism and accelerate exploratory analysis of complex features such as eddies and surface fronts. It integrates a Cinema database generator to overcome I/O bottlenecks and provides ocean-specific filters for seed-based fieldlines, isovolumes, depth profiles, front tracking, and eddy detection, all within a client-server ParaView architecture. Comprehensive scaling studies on ROMS and GLORYS datasets demonstrate near-linear performance and the effectiveness of depth-based partitioning and data redistribution, with practical gains in I/O via Cinema. A detailed Bay of Bengal case study showcases the system’s ability to analyze monsoon-driven currents, salinity transport, and submesoscale filaments, highlighting its potential for rapid, in-depth oceanographic analysis. Overall, pyParaOcean offers a flexible, extensible framework that can adapt to larger datasets and other geoscience domains, enabling researchers to efficiently visualize and track dynamic ocean processes.

Abstract

Oceanographers rely on visual analysis to interpret model simulations, identify events and phenomena, and track dynamic ocean processes. The ever-increasing resolution and complexity of ocean data, which stem from its dynamic nature and multivariate relationships, demand a scalable and adaptable visualization tool for interactive exploration. We introduce pyParaOcean, a scalable and interactive visualization system designed specifically for ocean data analysis. pyParaOcean offers specialized modules for common oceanographic analysis tasks, including eddy identification and salinity movement tracking. These modules integrate with ParaView as filters, which keeps the system easy to use while leveraging ParaView's parallelization capabilities and its many built-in general-purpose visualization functionalities. The creation of an auxiliary dataset stored as a Cinema database helps address I/O and network bandwidth bottlenecks while supporting the generation of quick overview visualizations. We present a case study on the Bay of Bengal (BoB) to demonstrate the utility of the system and scaling studies to evaluate the efficiency of the system.
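To make the idea of an auxiliary Cinema database more concrete, the following is a minimal sketch of an index over precomputed images parameterized by depth, time step, and scalar field. It assumes a Cinema Spec D-style layout (a `.cdb` directory holding a `data.csv` whose last column is `FILE`). The function name `write_cinema_index`, the column names, and the image file naming scheme are hypothetical and are not taken from pyParaOcean. No images are rendered here; only the index is written.

```python
import csv
import itertools
import tempfile
from pathlib import Path

def write_cinema_index(cdb_dir, depths, timesteps, fields):
    """Write a data.csv index with one row per (depth, time, field) image.

    Hypothetical helper: the parameter columns and file naming are
    illustrative assumptions, not the layout used by pyParaOcean.
    """
    cdb_dir = Path(cdb_dir)
    cdb_dir.mkdir(parents=True, exist_ok=True)
    rows = []
    for depth, t, field in itertools.product(depths, timesteps, fields):
        # Relative path where a pre-rendered slice image would be stored.
        image = f"images/{field}_t{t:04d}_d{depth:04d}.png"
        rows.append([depth, t, field, image])
    with open(cdb_dir / "data.csv", "w", newline="") as f:
        writer = csv.writer(f)
        # Parameter columns first, FILE column last.
        writer.writerow(["depth", "time", "field", "FILE"])
        writer.writerows(rows)
    return rows

with tempfile.TemporaryDirectory() as tmp:
    cdb = Path(tmp) / "ocean.cdb"
    rows = write_cinema_index(
        cdb,
        depths=[0, 50, 100],
        timesteps=[0, 1],
        fields=["salinity", "temperature"],
    )
    header = (cdb / "data.csv").read_text().splitlines()[0]
```

A viewer that reads such an index only needs to fetch the small image matching the chosen parameters, rather than loading the full 3D volume, which is the kind of I/O saving the abstract refers to.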

Paper Structure (27 sections, 18 figures)

Introduction
Related work
Contributions
Data
pyParaOcean: Design and architecture
Parallelism
Load balancing and ghost cells
pyParaOcean: Functionalities
Seed placement and fieldlines visualization
Isovolumes and isosurfaces
Depth profile view
Front tracking
Eddy identification and visualization
I/O and the NetCDF format
Cinema database generator
...and 12 more sections

Figures (18)

Figure 1: pyParaOcean functionality and user interface. (A) All pyParaOcean modules are implemented as ParaView filters. (B) ParaView pipeline browser shows the different datasets under study and the filters applied on them. (C) The seeding filter from pyParaOcean provides multiple options for tracing fieldlines. The figure illustrates the usage of various filters showcasing (D) salinity visualization using volume rendering, (E) interactive depth profile query visualization, (F) multivariate data visualization using a parallel coordinates plot of all fields in the dataset, (G) flow visualization with streamlines, (H) interactively seeded pathline visualization, (I) eddy detection and visualization, and (J) tracking high salinity water movement with a surface front track.
Figure 2: pyParaOcean system architecture.
Figure 3: (a) An example 2D mesh. (b) The mesh is partitioned for parallel processing. Four processes are created and each chunk is sent to a different process. Each process is represented by a unique color. The vtkDataSetSurfaceFilter filter is applied to the dataset to compute its boundary. (c) Output of the filter when there are no ghost cells. The filter incorrectly reports edges from the interior of the data as boundary. (d) Ghost cells are inserted on the common boundary between the individual partitions. The filter now reports the correct output, since all false positives are attached to the ghost cells, which are eliminated from the final output. (e) An alternate partitioning of the data into four processes. (f) After the addition of a ghost cell layer, it is apparent that this is an inefficient partitioning of the data since all processes work on almost the entire data, due to poor load balancing.
Figure 4: Cinema view with sliding toggles to scroll through the depth slices, time steps, and different scalar fields.
Figure 5: Partitioning the spatial domain of the ROMS dataset for efficient visualization. Each block in the partition, represented using a unique color, is sent to a unique core that processes the data within the block independently. (a,c) partitioning into 2 blocks. (b,d) partitioning into 4 blocks (left) and the corresponding ghost cells (right). (e,f) partitioning the domain into 144 blocks. (g) partitioning the domain into 8 blocks. BoB is shown overlaid in solid orange, indicating that several blocks are restricted to land and hence are not assigned any computational task.
...and 13 more figures
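
The ghost-cell behavior described in the Figure 3 caption can be reproduced with a small toy example. The sketch below is pure Python and does not use VTK or `vtkDataSetSurfaceFilter`. It assumes a 4x4 grid of quad cells split into two partitions, and treats an edge as boundary when exactly one cell in the local partition uses it. All function names are illustrative.

```python
from collections import Counter

def cell_edges(i, j):
    """Return the four edges of grid cell (i, j) as sorted vertex pairs."""
    a, b, c, d = (i, j), (i + 1, j), (i + 1, j + 1), (i, j + 1)
    return [tuple(sorted(e)) for e in ((a, b), (b, c), (c, d), (d, a))]

def boundary_edges(owned, ghost=frozenset()):
    """Edges used by exactly one local cell, reported only for owned cells.

    Edges that are boundary only because of a ghost cell are dropped,
    mimicking the removal of ghost cells from the final output.
    """
    count = Counter(e for cell in owned | ghost for e in cell_edges(*cell))
    return {e for cell in owned for e in cell_edges(*cell) if count[e] == 1}

def ghost_layer(owned, all_cells):
    """Cells outside `owned` that share an edge with some owned cell."""
    nbrs = {(i + di, j + dj) for (i, j) in owned
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))}
    return (nbrs & all_cells) - owned

NX, NY = 4, 4
all_cells = {(i, j) for i in range(NX) for j in range(NY)}
# Two partitions: left half and right half of the grid.
parts = [{c for c in all_cells if c[0] < 2},
         {c for c in all_cells if c[0] >= 2}]

truth = boundary_edges(all_cells)  # serial reference result
naive = set().union(*(boundary_edges(p) for p in parts))
with_ghosts = set().union(
    *(boundary_edges(p, ghost_layer(p, all_cells)) for p in parts)
)

false_positives = naive - truth  # interior edges on the partition interface
```

Without ghost cells, the edges along the interface between the two partitions are reported as boundary, as in panel (c). With one ghost layer, those edges are shared by an owned cell and a ghost cell, so the merged result matches the serial one, as in panel (d).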
