
Abstract: This talk takes a deep dive into the state of analysis-ready data on open data portals from the U.S. Government (data.gov) and beyond, with a focus on beer statistics and analytics. It will share resources and examples on how to navigate open data portals, collect and scrape data from government websites, and advocate for data in formats that data scientists can use (think CSV, not PDF). Case studies will use popular R packages such as rvest, dplyr, and ggplot2, along with ttbbeer, an R data package of pre-processed beer statistics from the U.S. Department of the Treasury, developed by the speaker, Jasmine Dumas.
Bio: Jasmine Dumas is a Data Scientist at Simple Finance, where she helps people feel confident with their money by making banking beautiful. She has a background in biomedical engineering, aerospace manufacturing, and product design & development. She is an active member of the R programming community with two packages published on the Comprehensive R Archive Network (CRAN), which focus on increasing access to analysis-ready data (ttbbeer) and developing user-friendly interfaces (shinyLP). She enjoys developing open source tools to enable reproducible research and has participated in Google Summer of Code (Summer 2015) and NASA Datanauts (Spring 2017).