Processing All-Sky Images At Scale On The Amazon Cloud: A HiPS Example
G. Bruce Berriman, John C. Good
TL;DR
The study demonstrates a scalable, cloud-based pipeline for producing all-sky infrared HiPS maps from large FITS datasets, using the Montage engine, HEALPix tiling, and a shared composite histogram to keep brightness consistent across tiles. By deploying Montage in Docker containers on AWS Batch, it processes 3328 sky tiles in parallel and yields a practical cost breakdown for cloud-based HiPS creation. The work highlights the engineering considerations that matter to scientists performing large-scale image mosaicking on cloud platforms and shows that HiPS creation at scale is feasible for infrared astronomy. This approach lowers barriers to accessing and visualizing vast all-sky infrared data and can be extended to larger image collections.
Abstract
We report on a project that has developed a practical approach to processing all-sky image collections on cloud platforms, using as an exemplar application the creation of three-color Hierarchical Progressive Survey (HiPS) maps of the 2MASS data set with the Montage Image Mosaic Engine on Amazon Web Services. We emphasize the issues that scientists must consider when using cloud platforms for such parallel processing, so that this work can serve as a guide for similar large-scale processing. A HiPS map is based on the HEALPix sky-tiling scheme. Progressive zooming of a HiPS map reveals an image sampled at ever smaller or larger spatial scales that are defined by the HEALPix standard. Briefly, the approach used by Montage involves creating a base mosaic at the lowest required HEALPix level, usually chosen to match as closely as possible the spatial sampling of the input images, then cutting out the HiPS cells in PNG format from this mosaic. The process is repeated at successive HEALPix levels to create a nested collection of FITS files, from which the PNG files shown in HiPS viewers are created. Stretching FITS files to produce PNGs is based on an image histogram. For composite regions (up to and including the whole sky), the histograms for each tile can be combined to create a composite histogram for the region. Using this single histogram for each of the individual FITS files means that all the PNGs are on the same brightness scale, so displaying them side by side in a HiPS viewer produces a continuous, uniform map across the entire sky.
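The composite-histogram idea can be illustrated with a minimal NumPy sketch. It is not the Montage implementation: the function names (tile_histogram, composite_levels, stretch_to_8bit), the shared bin edges, and the linear percentile stretch are all assumptions made for illustration, and the stretch Montage actually applies may differ. The sketch only shows why deriving one pair of cut levels from the summed per-tile histograms puts every tile on the same brightness scale.

```python
import numpy as np

def tile_histogram(pixels, edges):
    """Count one tile's finite pixel values in bins shared by all tiles.

    Values outside [edges[0], edges[-1]] are ignored by np.histogram.
    """
    finite = pixels[np.isfinite(pixels)]
    counts, _ = np.histogram(finite, bins=edges)
    return counts

def composite_levels(histograms, edges, lo_pct=1.0, hi_pct=99.0):
    """Sum per-tile histograms and read cut levels off the composite CDF.

    lo_pct and hi_pct are hypothetical percentile cuts, not values
    taken from the paper.
    """
    total = np.sum(histograms, axis=0)
    cdf = np.cumsum(total) / total.sum()
    lo = edges[np.searchsorted(cdf, lo_pct / 100.0)]      # left edge of bin
    hi = edges[np.searchsorted(cdf, hi_pct / 100.0) + 1]  # right edge of bin
    return lo, hi

def stretch_to_8bit(pixels, lo, hi):
    """Linearly map [lo, hi] to 0..255; blank (NaN) pixels become 0."""
    scaled = np.clip((np.nan_to_num(pixels, nan=lo) - lo) / (hi - lo), 0.0, 1.0)
    return (scaled * 255).astype(np.uint8)
```

In use, each worker would compute tile_histogram for its own tile, a single step would call composite_levels on the collected histograms, and every tile would then be stretched with the same (lo, hi) pair. Because the per-tile histograms are small compared with the images, only the histograms need to be gathered in one place.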
