Visualization throughout the data science workflow: Why it’s useful, and how not to lie

Abstract: Scientists, data scientists included, want to believe that "the data speaks for itself," and so tend to focus on numerical methods and analyses over visualization and communication. However, data visualization is an essential tool in a data scientist's toolbox. It allows you to see patterns that would be invisible, or at best hard to spot, in the numbers alone. These patterns help make sense of data, facilitating good decision-making throughout the data science workflow.

In this talk, we will go beyond simply praising data visualization and walk through accessible examples of two main concepts: (1) where visualization serves its purpose in the data science workflow (thus visually demonstrating the importance of visualization!), and (2) some oft-overlooked considerations in choosing and designing visualizations. The goal is to give participants new ideas for integrating visualization into their data science workflows, and a better understanding of how to make visualization design decisions that facilitate clear and effective communication.

In Part I, we will explore several main steps in the data science workflow where data visualization can enhance efficiency, effectiveness, or intuition, including how it can help you build better statistical and machine learning models. In Part II, we will build on Part I by discussing what to consider when choosing and designing visualizations, including facets of data visualizations that can enhance or confound interpretation, and the importance of integrity in communication and design decision-making. Both parts will focus on demonstrating concepts through clear and straightforward visual examples.

Bio: Lindsay is a data scientist at T4G Limited, where she works with clients to provide the analytics and visualizations required for data-driven business decisions. Her data journey began with an MA and PhD in Biogeochemistry, from Boston University and Brown University respectively. With more than a decade of experience in research methods and the scientific process, she excels at asking incisive questions and using data to tell compelling stories, and is skilled at translating insight into impact.
Lindsay is passionate about teaching data skills to a variety of audiences, from academic and professional to amateur. Through this work, she has developed and taught workshops and online courses at the University of New Brunswick, and is a Data Carpentry instructor and Canada Learning Code chapter co-lead. When she’s not joyfully up to her knees in data, she can be found trail running, backpacking, doing yoga, cooking, or playing drums in a rock band.

Open Data Science Conference