Deep autoregressive modeling for land use land cover

Christopher Krapu; Mark Borsuk; Ryan Calder

TL;DR

The paper tackles the challenge of modeling land use / land cover (LULC) with long-range spatial dependencies by adapting a deep autoregressive, image-inpainting–style approach. It introduces PixelConstrainedCNN, a large discrete autoregressive model that combines a PixelCNN prior with a ResNet-based auxiliary network to generate conditional LULC distributions over regional patches. The model is benchmarked against a spatial categorical conditional autoregressive (SCCAR) statistical baseline using NLCD 2019 data coarsened to $40\times40$ grids. Results show that the deep model captures intricate spatial patterns such as roads and water bodies and generates realistic completions, but its predictive distributions are underdispersed (overconfident), and calibration remains a challenge, with temperature tuning offering only partial improvement. A Michigan state park case study demonstrates sequential infilling for larger regions and highlights the model’s potential for counterfactual landscape analysis, while pointing to limitations in long-range context and sampling efficiency. The authors propose future work to integrate additional data layers, scale to larger regions, and explore alternative architectures to improve calibration and practicality for LULC change assessment.
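The temperature tuning mentioned above can be illustrated with a minimal sketch. It assumes the common approach of dividing per-class logits by a temperature $T$ before the softmax; the source does not spell out the paper's exact mechanism, and the logits below are hypothetical values for a single pixel. With $T > 1$ the distribution flattens and sampling variability increases; with $T < 1$ it sharpens.

```python
import math
import random

def temperature_softmax(logits, temperature=1.0):
    """Convert per-class logits into probabilities at a given temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_class(logits, temperature=1.0, rng=random):
    """Draw one class index from the tempered categorical distribution."""
    probs = temperature_softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def entropy(probs):
    """Shannon entropy (nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical logits for one pixel over four land cover classes (assumption).
logits = [2.0, 1.0, 0.5, -1.0]
for t in (0.5, 1.0, 2.0):
    probs = temperature_softmax(logits, t)
    print(t, [round(p, 3) for p in probs], round(entropy(probs), 3))
```

In an autoregressive sampler, `sample_class` would be called once per pixel in raster order, so a single temperature value changes the variability of the whole completed image.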

Abstract

Land use / land cover (LULC) modeling is a challenging task due to long-range dependencies between geographic features and distinct spatial patterns related to topography, ecology, and human development. We identify a close connection between modeling of spatial patterns of land use and the task of image inpainting from computer vision and conduct a study of a modified PixelCNN architecture with approximately 19 million parameters for modeling LULC. In comparison with a benchmark spatial statistical model, we find that the PixelCNN-based model is capable of capturing much richer spatial correlation patterns such as roads and water bodies but does not produce a calibrated predictive distribution, suggesting the need for additional tuning. We find evidence of predictive underdispersion with regard to important ecologically relevant land use statistics such as patch count and adjacency, which can be ameliorated to some extent by manipulating sampling variability.
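The patch count and adjacency statistics named in the abstract can be sketched on a toy grid. This is an illustration under stated assumptions rather than the paper's definitions, which are not given here: a patch is taken to be a 4-connected group of same-class cells, and adjacency is taken to be the number of horizontally or vertically neighboring cell pairs for each unordered class pair. The function names and the `toy` grid are hypothetical.

```python
from collections import Counter, deque

def patch_count(grid):
    """Count 4-connected patches of same-class cells in a 2D grid."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # Unvisited cell: start a new patch and flood-fill it.
            patches += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not seen[ni][nj]
                            and grid[ni][nj] == grid[i][j]):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
    return patches

def adjacency_counts(grid):
    """Count neighboring cell pairs (right and down) per unordered class pair."""
    rows, cols = len(grid), len(grid[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                counts[tuple(sorted((grid[r][c], grid[r][c + 1])))] += 1
            if r + 1 < rows:
                counts[tuple(sorted((grid[r][c], grid[r + 1][c])))] += 1
    return counts

# Toy 3x3 land cover grid with three hypothetical classes (0, 1, 2).
toy = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 0],
]
print(patch_count(toy))       # 4
print(adjacency_counts(toy))
```

Computing such statistics over many sampled completions and comparing their spread with the observed value is one way to check for underdispersion: if the observed statistic regularly falls outside the range of the samples, the predictive distribution is too narrow.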

Paper Structure (13 sections, 3 equations, 11 figures, 2 tables)

Introduction
Data and Methods
Methods
Data
Models
Analyses & Results
Unconditional sampling
Image inpainting
Diversity & calibration
Comparison with SCCAR model
Case study
Discussion
Conclusion

Figures (11)

Figure 1: Examples of training data at a $40\times 40$ resolution. Common cover co-occurrence patterns include pasture and cropland, woody wetlands and forest, as well as high/medium/low levels of urban development. Each image covers 23.04 km$^2$.
Figure 2: Input regions for components of the PixelConstrainedCNN architecture. The image completion spanning the white cells (middle region) above is sampled using a distribution modeled using an autoregressive PixelCNN network and an auxiliary ResNet network. The autoregressive network is only provided the pixels from the top region due to the design of its masked convolutional layers, while the ResNet auxiliary network uses pixels from the entire observed image (top and bottom regions). The prior context and conditional context for the single white pixel in row 4, column 5, counting from the left and top, are specified as shown.
Figure 3: Architecture for the PixelConstrainedCNN model. The autoregressive PixelCNN model consists of gated convolutional blocks that use masks to prevent information leakage from future pixels during training. The auxiliary ResNet network conditions the PixelCNN model on the observed pixels and includes residual and squeeze-and-excite layers. Elementwise addition is indicated by $\oplus$ and multiplication by $\odot$.
Figure 4: Image completions using the spatial categorical conditional autoregressive model. These completions help to highlight spatial correlation patterns over a larger area.
Figure 5: Plate diagram of the benchmark latent conditional autoregressive model. The observed adjacency matrix $\bm{D}$ informs the spatial covariance of $\omega_{\cdot k}$. Cross-class correlations between elements of $\bm{U}$ are induced via the matrix $\bm{A}$. Shaded circles represent observed quantities while unshaded circles denote parameters to estimate. $N$ is the number of pixels in the image, and $K$ is the number of land cover classes. $K'$ refers to the second index running over $\{1,\dots,K\}$, which is only used to index the columns of the matrix $\bm{A}$.
...and 6 more figures
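
The masking described in the captions for Figures 2 and 3 can be sketched as a kernel mask in raster-scan order. This is a minimal illustration of the standard PixelCNN idea, not the paper's implementation: the helper `causal_mask` and its `include_center` flag are hypothetical names, and gated PixelCNN variants usually realize the same constraint with separate vertical and horizontal convolution stacks rather than a single masked kernel.

```python
def causal_mask(kernel_size, include_center):
    """Build a kernel_size x kernel_size 0/1 mask for raster-scan ordering.

    Entries above the center row, or left of the center in the center row,
    are 1 (already-generated pixels); all later positions are 0. The center
    itself is kept only when include_center is True, as is typical for
    layers after the first.
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    center = kernel_size // 2
    mask = []
    for i in range(kernel_size):
        row = []
        for j in range(kernel_size):
            before = i < center or (i == center and j < center)
            at_center = i == center and j == center
            row.append(1 if before or (include_center and at_center) else 0)
        mask.append(row)
    return mask

# First-layer style mask: the current pixel must not see itself.
for row in causal_mask(3, include_center=False):
    print(row)
# [1, 1, 1]
# [1, 0, 0]
# [0, 0, 0]
```

Multiplying a convolution kernel elementwise by such a mask zeroes the weights on pixels that come later in the generation order, which is why the autoregressive network in Figure 2 only draws on the region above the current pixel (and earlier pixels in its row) while the auxiliary network supplies information from the rest of the observed image.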
