A Generative AI Technique for Synthesizing a Digital Twin for U.S. Residential Solar Adoption and Generation

Aparna Kishore, Swapna Thorve, Madhav Marathe

TL;DR

This paper presents a novel methodology for generating a highly granular, realistic, residential-scale dataset of rooftop solar adoption across the contiguous United States. A case study for Virginia shows increased rooftop solar adoption under the 30% Federal Solar Investment Tax Credit, especially in Low-to-Moderate-Income communities.

Abstract

Residential rooftop solar adoption is considered crucial for reducing carbon emissions. The lack of photovoltaic (PV) data at a finer resolution (e.g., household, hourly levels) poses a significant roadblock to informed decision-making. We discuss a novel methodology to generate a highly granular, residential-scale realistic dataset for rooftop solar adoption across the contiguous United States. The data-driven methodology consists of: (i) integrated machine learning models to identify PV adopters, (ii) methods to augment the data using explainable AI techniques to glean insights about key features and their interactions, and (iii) methods to generate household-level hourly solar energy output using an analytical model. The resulting synthetic datasets are validated using real-world data and can serve as a digital twin for modeling downstream tasks. Finally, a policy-based case study utilizing the digital twin for Virginia demonstrated increased rooftop solar adoption with the 30% Federal Solar Investment Tax Credit, especially in Low-to-Moderate-Income communities.

Paper Structure

This paper contains 7 sections, 10 equations, 17 figures, 9 tables, 3 algorithms.

Figures (17)

  • Figure 1: U.S. and rooftop solar adoption. (a) Different combinations of spatial (household, census tract, county, state and U.S.) and temporal (hourly, daily, monthly, yearly) resolutions possible using the solar energy generation model developed in this work. (b) U.S. county-level solar adoption rate choropleth map in the synthetic population. Each county is shaded with the color intensity reflecting the adoption rate. The total solar adoption in each county has been normalized with respect to the number of households in that county. This normalization allows for a more accurate representation of solar adoption rates, as it accounts for variations in county population sizes. The varying intensities of color represent geographical disparities in solar energy uptake across the country. California stands out from other states, exhibiting a significantly higher rate of solar adoption. The map also provides insights into regional trends where the states in the West lead in solar adoption, followed by the Northeast. In addition, the map indicates that the South lags behind the West regarding solar adoption.
  • Figure 2: Spatial and temporal analysis of solar energy production for WA, VA, ID, LA, and MA in the synthetic population. (a) Monthly solar energy production: Each pie chart shows the distribution of solar energy generated by all the households across five selected states for each month. It is divided into twelve segments, each corresponding to a month of the year. (b) Hourly aggregate solar energy production by season: The line graph presents the aggregate solar energy produced in each hour by rooftop solar panels for each season. Each data point on the graph represents the total energy produced during a specific hour, aggregated over an entire season. The x-axis indicates the hour of the day in each state's local time zone, while the y-axis denotes the hourly-seasonal aggregate solar energy produced, measured in megawatt-hours (MWh). The visualization offers insights into the geographic and temporal fluctuations in residential solar energy generation, reflecting the impact of regional climatic conditions and other environmental factors.
  • Figure 3: Comparative analysis of state-level models at the global level using beeswarm plots. The plots illustrate the SHAP values for various features, arranged on the y-axis according to their importance. The x-axis displays the SHAP values, with color varying from blue to magenta to represent feature values from low to high. Points cluster where data concentration is highest. (a) VA and NC: side-by-side comparison of beeswarm plots for VA and NC. (b) TX and NY: side-by-side comparison of beeswarm plots for TX and NY.
  • Figure 4: Validation of solar adoption and PV generation synthetic datasets. (a) Comparison of synthetic solar adopters with the DeepSolar and LBNL solar datasets across U.S. states. The x-axis represents the contiguous states in the U.S., while the y-axis denotes the number of solar adopters on a log scale. (b) Average correlation between hourly load curves of synthetic households and Pecan Street households. The x-axis represents the month, and the y-axis shows the average correlation calculated using mean Pearson correlation coefficients. The data consistently exhibit a high positive correlation across all months for both TX and NY. (c) Comparison of the daily average solar generation distribution between the Pecan Street dataset for Austin, TX and the synthetic solar generation dataset for TX. Pecan Street data is depicted by the solid blue curve. The solid red curve illustrates the mean of the synthetic data, and the red dotted curves indicate the standard deviation of the synthetic dataset. (d) Comparison of the distribution between the Pecan Street dataset for NY and the synthetic solar generation dataset. Pecan Street data is depicted by the solid blue curve. The solid green curve illustrates the mean of the synthetic data, and the green dotted curves indicate the standard deviation of the synthetic dataset.
  • Figure 5: Policy impacts on rooftop solar adoption in the VA synthetic population. (a) Comparison of different cases in total solar adoption across VA. The line plot shows adopter counts for seven policies (Cases 1a, 1b, 2a, 2b, 3, 4, and 5) over 10 time steps, with each line depicting a different policy's impact on adoption. The x-axis indicates time steps, and the y-axis represents adopter counts. The plot reveals how each policy influences adoption patterns, with Cases 4, 1b, and 5 showing distinct trajectories compared to the similar patterns of Cases 1a, 2a, 2b, and 3. (b) Bar chart of LMI solar adoption in VA's rural and urban areas under various cases. The rural and urban LMI populations are around 31.7% and 68.3%, respectively, of the total LMI population. The x-axis lists the cases, and the y-axis shows total LMI solar adoption, with blue for rural and orange for urban areas. Equal opportunity policies (Cases 4 and 1b) show similar rural adoption, with targeted policies (Cases 2b and 5) following. Case 4 leads in urban adoption, highlighting the 30% tax credit's effectiveness in enhancing urban LMI penetration.
  • ...and 12 more figures
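
Two of the quantities named in the captions above are simple enough to sketch: the household-normalized county adoption rate of Figure 1(b) and the mean Pearson correlation of Figure 4(b). The Python below is a minimal illustration under stated assumptions, not the authors' implementation. The function names are hypothetical. Comparing each synthetic curve against a single reference curve is also an assumption, since the captions do not say how curves are paired.

```python
from math import sqrt


def adoption_rate(adopters, households):
    """Adopters per household for one county (hypothetical helper).

    Returns 0.0 for a county with no households to avoid dividing by zero.
    """
    return adopters / households if households else 0.0


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def mean_pearson(synthetic_curves, reference_curve):
    """Average correlation of each synthetic curve with one reference curve.

    Assumption: pairing every synthetic curve with a single reference
    curve; the source does not specify the pairing scheme.
    """
    scores = [pearson(curve, reference_curve) for curve in synthetic_curves]
    return sum(scores) / len(scores)
```

Dividing by household count rather than plotting raw adopter totals keeps populous counties from dominating the map, which is the motivation the Figure 1 caption gives for the normalization.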

Theorems & Definitions (2)

  • definition 1
  • definition 2