
Abstract: Climate change has already had a major impact across the United States. In the West, we are experiencing less winter snowpack accumulation, longer springs, and hotter summers. Coupled with a hundred years of fire suppression, this creates a perfect environment for wildfire to spread. Both operationally and to inform public policy, we need physical spread models, as well as impact assessment models built on them. However, conventional approaches are generally coarse in scale, out of date, and error-prone.
The most important physical variable is the accumulation of brush and forest debris beneath the forest canopy, but this is difficult to detect with 30-60m pixels. Fuel moisture is similarly important and highly variable in space and time, yet it is typically measured at only a few dozen weather stations per region. The locations of people and infrastructure are clearly the primary socioeconomic variables. Yet conventional models treat all people as identical and don't map anything much smaller than a census block. This session explores how modern data science techniques and GPU-accelerated analytics can be combined with better data to improve outcomes.
Taking California as a case study, we start with disaggregate data from a Microsoft ML model, which provides a footprint for each building in the state (10 million in total). We then consider forest structure data from the California Forest Observatory, also generated using ML models. At 10m/pixel, this amounts to roughly 400 million pixels per time slice, across 4 time slices and 5 variables. To concentrate on wildfire risk for households, we use the fire science concept of "defensible space" and buffer buildings to extract the surrounding forest conditions. Over the last four years, California has experienced some devastating fires, and their perimeters are available as open data. Lastly, we resample census demographic data down to the household level to estimate total population and the subsets of that population with special needs.
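As a rough illustration of the buffering step, the sketch below uses geopandas and rasterstats to buffer each building footprint by a nominal defensible-space radius and summarize a canopy-cover raster within that buffer. The file names, column names, and the 30m radius are assumptions for illustration, not the exact pipeline presented in the session.

```python
# Sketch: extract "defensible space" forest conditions around building footprints.
# Assumes a building footprints file and a 10m canopy-cover raster; paths, layer
# names, and the 30 m buffer distance are illustrative placeholders.
import geopandas as gpd
from rasterstats import zonal_stats

buildings = gpd.read_file("ca_building_footprints.gpkg")   # ~10M footprints statewide
buildings = buildings.to_crs(epsg=3310)                    # California Albers, units in meters

# Buffer each footprint by a nominal defensible-space distance (e.g., 30 m).
defensible = buildings.copy()
defensible["geometry"] = defensible.geometry.buffer(30)

# Summarize canopy cover (or any other forest-structure layer) within each buffer.
stats = zonal_stats(
    defensible.geometry,
    "cfo_canopy_cover_10m.tif",
    stats=["mean", "max"],
    nodata=-9999,
)
defensible["canopy_mean"] = [s["mean"] for s in stats]
defensible["canopy_max"] = [s["max"] for s in stats]
```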
What can we learn from this data? First, we deploy unsupervised learning techniques, in particular cluster analysis. Applied to the biophysical data, this supports the classification of the "defensible space" surrounding millions of buildings. When we add the social and fire history data, we can further characterize the relationships between people, landscape management, and outcomes. Second, applying supervised techniques such as Random Forests lets us explore the factors correlated with particular outcomes.
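To make those two analysis steps concrete, the minimal sketch below applies k-means clustering to per-building defensible-space features and then fits a scikit-learn Random Forest relating those features to a burned/unburned outcome. The feature names, the outcome column, and the choice of k-means as the specific clustering method are assumptions for illustration; the session's actual feature set and models may differ.

```python
# Sketch: unsupervised clustering of defensible-space conditions, then a
# supervised Random Forest on fire outcomes. Column names ("canopy_mean",
# "fuel_moisture", "burned", ...) are illustrative placeholders.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_parquet("building_defensible_space_features.parquet")
feature_cols = ["canopy_mean", "canopy_max", "ladder_fuel", "fuel_moisture"]
X = StandardScaler().fit_transform(df[feature_cols])

# Unsupervised: group buildings into classes of defensible-space condition.
df["ds_class"] = KMeans(n_clusters=6, random_state=0).fit_predict(X)

# Supervised: relate conditions to whether a building fell inside a fire perimeter.
y = df["burned"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", rf.score(X_test, y_test))
print(dict(zip(feature_cols, rf.feature_importances_)))
```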
Bio: Dr. Michael Flaxman is OmniSci's Product Lead. In addition to leading product strategy at OmniSci, Dr. Flaxman focuses on the combination of geographic analysis with machine learning, or "geoML." He has served on the faculties of MIT, Harvard and the University of Oregon. Dr. Flaxman has participated in GIS projects in 17 countries. He has been a Fulbright fellow and has served as an advisor to the Inter-American Development Bank, the World Bank and the National Science Foundation. Dr. Flaxman previously served as industry manager for Architecture, Engineering and Construction at ESRI, the world's largest developer of GIS technology. Dr. Flaxman received his doctorate in design from Harvard University in 2001 and holds a master's in Community and Regional Planning from the University of Oregon and a bachelor's in biology from Reed College.