Managing Data Projects Like a Software Engineer


In this talk we’ll go over how to write code that is reproducible and easy for other people to work with.

We’ll start by talking about virtual environments. Virtual environments allow you to define the dependencies for your projects (such as NumPy or Matplotlib) and to keep these dependencies separated between projects. We’ll also outline some choices you have about how to manage your virtual environments.
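
As a minimal sketch of that separation, assuming Python's standard-library venv module (one of several options, and not necessarily the tool the talk recommends), the following creates one isolated environment per project. The function name create_project_env and the .venv directory name are illustrative choices, not taken from the talk.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create an isolated virtual environment at project_dir/.venv.

    Hypothetical helper for illustration only.
    """
    env_dir = project_dir / ".venv"
    # with_pip=False keeps this sketch fast and offline; a real project
    # would usually want pip available to install NumPy, Matplotlib, etc.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Each project gets its own environment directory, so packages
        # installed for one project never leak into the other.
        for name in ("project_a", "project_b"):
            env_dir = create_project_env(Path(tmp) / name)
            print(name, "->", env_dir)
```

Each environment has its own site-packages directory, so two projects can pin different versions of the same library without conflict.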

Next we’ll talk about version control and why you should be using it even if you’re the only contributor to a project. Version control creates a log of what work was done and why, and lets you return to an earlier state when you inevitably make a change to your project that you can’t figure out how to undo.

Then we’ll discuss project structure by reviewing DrivenData’s Cookiecutter Data Science template. The template encourages a number of best practices, and anyone familiar with it will be reasonably well oriented when looking at your code for the first time.

Finally, we’ll briefly cover why you should establish coding styles and always use a linter.


Michael is a data engineer at Amazon in San Diego. He works on the Buyer Risk Prevention team, whose mission is to keep Amazon stores safe and trustworthy by protecting customer accounts from takeover, fraud, and abuse. Before joining a “big tech” company, he worked on an enterprise data warehouse migration project at Petco and helped build the data science team at a startup called Classy. He’s most passionate about doing work that makes a positive impact and helps give everyone in the world an equal opportunity to do what they love.

Open Data Science




One Broadway
Cambridge, MA 02142
