(Machine) Learning to Live with Wildfires – Mitigating Risks of Climate Change with Accelerated Analytics


Climate change has already had a major impact across the United States. In the West, we are experiencing less winter snowpack accumulation, longer springs, and hotter summers. Coupled with a hundred years of fire suppression, this creates a perfect environment for wildfires to spread. Both operationally and to inform public policy, we need physical spread models, as well as impact assessment models built on them. However, conventional approaches are generally coarse in scale, out of date, and error-prone.

The most important physical variable is the accumulation of brush and forest debris beneath the forest canopy, but this is difficult to detect with 30-60 m pixels. Fuel moisture is similarly important and highly variable in space and time, but it is typically measured at only a few dozen weather stations per region. The locations of people and infrastructure are clearly primary socioeconomic variables, yet conventional models treat all people as identical and don’t map anything much smaller than a census block. This session explores how modern data science techniques and GPU-accelerated analytics can be combined with better data to improve outcomes.

Taking California as a case study, we start with disaggregate data from a Microsoft ML model, which provides a footprint for each building in the state (10 million in total). We then consider forest structure data from the California Forest Observatory, also generated using ML models. At 10 m per pixel, this amounts to 400 million pixels for each of 4 time slices and 5 variables. To concentrate on wildfire risk for households, we use the fire science concept of “defensible space” and buffer buildings to extract the surrounding forest conditions. Over the last four years, California has experienced some devastating fires, and their perimeters are available as open data. Lastly, we resample census demographic data at the household level to get estimates of total population and of subsets of that population with special needs.
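The buffering step can be illustrated with a minimal sketch. It is not the pipeline used in the session: it assumes a small synthetic canopy-cover grid, buildings reduced to single pixel positions, and a 30 m buffer distance, all of which are hypothetical. A circular buffer around a point stands in for buffering a full building footprint polygon, which in practice would call for a GIS library.

```python
import numpy as np

PIXEL_SIZE_M = 10.0  # raster resolution stated in the abstract (10 m per pixel)


def buffer_mean(raster, row, col, radius_m, pixel_size_m=PIXEL_SIZE_M):
    """Mean raster value within radius_m of the pixel centre at (row, col).

    A circular buffer around one point is a simplification of buffering
    a building footprint polygon.
    """
    n_rows, n_cols = raster.shape
    rows, cols = np.ogrid[:n_rows, :n_cols]
    # Distance in metres from every pixel centre to the building pixel centre.
    dist_m = np.hypot(rows - row, cols - col) * pixel_size_m
    mask = dist_m <= radius_m
    return float(raster[mask].mean())


# Hypothetical canopy-cover raster (percent): forest in the west half, open ground in the east.
canopy = np.zeros((50, 50))
canopy[:, :25] = 80.0

# Hypothetical buildings given as (row, col) pixel positions.
buildings = {"in_forest": (25, 5), "on_edge": (25, 25), "in_open": (25, 45)}

# 30 m is an assumed buffer distance for this sketch.
summary = {
    name: buffer_mean(canopy, r, c, radius_m=30.0)
    for name, (r, c) in buildings.items()
}
```

Each entry in `summary` is one per-building feature of the kind described above. Repeating the call for each forest structure variable and time slice would yield a feature table with one row per building.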

What can we learn from this data? First, we deploy unsupervised learning techniques, cluster analysis in particular. When applied to the biophysical data, this supports the classification of the “defensible space” surrounding millions of buildings. When we add the social and fire history data, we can further characterize the relationships between people, landscape management, and outcomes. Second, when we apply supervised techniques such as Random Forests, we can explore the factors correlated with particular outcomes.
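As a rough illustration of the cluster analysis step, the sketch below groups buildings by the conditions in their defensible space. The abstract does not say which clustering algorithm is used, so plain k-means (Lloyd's algorithm) is an assumption here. The two feature columns (canopy cover and ladder fuel, both in percent) and the synthetic values are also hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows of X chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each row to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members; keep it if it has none.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Synthetic per-building features: [canopy cover %, ladder fuel %] (hypothetical).
data_rng = np.random.default_rng(42)
sparse = data_rng.normal([10.0, 5.0], 3.0, size=(100, 2))   # well-cleared surroundings
dense = data_rng.normal([70.0, 40.0], 3.0, size=(100, 2))   # heavily vegetated surroundings
X = np.vstack([sparse, dense])

labels, centroids = kmeans(X, k=2)
```

Both features here share a percent scale. With mixed units, standardizing each column first would keep one variable from dominating the distance. At the scale of millions of buildings, a GPU-accelerated implementation would be the natural replacement for this NumPy loop.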


Abhishek Damera’s work as a Data Scientist at OmniSci involves using state-of-the-art machine learning algorithms to capture the underlying trends in geospatial data. Prior to this, he completed his Master’s in Transportation Engineering at UC Berkeley, where most of his work focused on classifying roads according to vehicle speed profiles.

Open Data Science




