An Introduction to Data Wrangling with SQL


Data wrangling is an essential foundational topic for anyone considering a role in data engineering, data science, or machine learning. This session will help you understand core data wrangling concepts, including what data is, data generation and collection, data cleaning, profiling, transformation, and other essential data wrangling topics. As this is an interactive training session, in addition to covering these topics, we will layer on hands-on SQL training and an introduction to relational databases. As we journey through the data workflow, we will use SQL to wrangle and transform the data as needed. SQL consistently makes the top 5 job requirements list for data scientists, data analysts, machine learning engineers, and other related data roles. SQL is the universal go-to language for manipulating structured data stores, including relational databases. With this foundational understanding, you will not only have job-ready data skills but will also be better positioned to proceed to other introductory-level courses in data analysis, programming, data science, and machine learning at ODSC East 2023.

Session Outline:

Lesson 1: Introduction
Introduction to Data Wrangling & SQL
Data Wrangling Tools
Why use SQL?

Lesson 2: The Data Life Cycle
Understanding the Data Life Cycle
What is Data?
Understanding Data Types
Structured and Unstructured Data
Data Collection and Sourcing

Lesson 3: Relational Databases
Popular Relational Databases
Understanding Tables and Databases
Relational Databases

Lesson 4: Using SQL - Hands-on
Introduction to the SQL Syntax
Your First SQL Query Statement
Filtering Data with SQL

Lesson 5: Data Profiling - Hands-on
Understanding Data Profiling
Data Profiling with SQL
How to Identify Outliers
Working with Outliers with SQL
Understand Correlations with SQL
Finding Data Issues with SQL

Lesson 6: Databases and Tables - Hands-on
Introducing Servers and Clients
Understanding a Database Schema: An Example
Introduction to Data Types
Creating Tables with SQL
How to Insert Data into Tables

Lesson 7: Data Preparation - Hands-on
Introduction to Data Preparation
Data Transformation with SQL
Avoiding Unintended Consequences with SQL Transaction Control
Using SQL Subqueries
Data Transformation and Data Enrichment
Understand Normalization

Lesson 8: Data Shaping - Hands-on
Data Shaping Examples
Data Shaping with SQL Group Command
Data Shaping with SQL JOINS
Understand SQL Joins
Modifying Data with SQL Update Command

Lesson 9: Data Wrangling for Data Visualization
Why Prepare Data for Visualization?
Pivoting Data with SQL Crosstab
Understanding the SQL Lag Command


Sheamus McGovern is the founder of ODSC (The Open Data Science Conference). He is also a software architect, data engineer, and AI expert. He started his career in finance by building stock and bond trading systems and risk assessment platforms and has worked for numerous financial institutions and quant hedge funds. Over the last decade, Sheamus has consulted with dozens of companies and startups to build leading-edge data-driven applications in finance, healthcare, eCommerce, and venture capital. He holds degrees from Northeastern University, Boston University, Harvard University, and a CQF in Quantitative Finance.

Open Data Science
One Broadway
Cambridge, MA 02142
