R Tools for Data Science

Abstract: R is rapidly evolving to accommodate the needs of data scientists. Through R’s package system, developers are shaping the language with tools that integrate R’s inferential capabilities into workflows designed to support enterprise-wide data science. These tools include efficient interfaces to databases, new language constructs for data manipulation, the extension of machine learning algorithms to time-to-event data, projects that facilitate large-scale feature engineering, new approaches to working with spatial data and visualizations, integrated connections to distributed machine learning platforms such as Spark and TensorFlow, JavaScript-based interactive visualizations, the use of Markdown to construct reproducible workflows, and communication platforms such as Shiny for sharing results in a way that enables non-programmer stakeholders to develop their own insights. In this talk, I will briefly describe the features of R that are enabling this transformation of the language and present concrete examples of the tools mentioned above. My hope is that the audience will come away with a big-picture view of R as a data science platform and will also see how to follow up and work with the tools presented.

Bio: Joseph is RStudio’s “Ambassador at Large” for all things R, editor of the R Views blog, and RStudio’s representative on the R Consortium’s board of directors. Joseph came to RStudio via Revolution Analytics and Microsoft, where he was a data scientist, blogger and community manager. Joseph began his engagement with R as a graduate student in 2004 and deepened his appreciation for the language while working as a statistician for a small healthcare economics firm. Previously, Joseph held a variety of technical, marketing and sales positions at technology startups spanning multiple industries, including government contracting, local area networks, disk drives and test equipment. Joseph studied Classics and Mathematics as an undergraduate at Franklin & Marshall College and earned an M.A. in Humanities and an M.S. in Statistics from the California State University. Joseph’s main blogging interest is to tell the story of R and the vibrant, worldwide community that makes R happen.