Building and Operating Cloud Native Analytics Systems at Scale


This 90-minute deep dive introduces key patterns and techniques for building and operating cloud native data and analytics systems that reliably feed your mission-critical analytics needs at scale. Learn why containerized applications are central to building reliable, scalable, manageable, and maintainable systems, and how to architect composable applications to unlock the potential available when all of the pieces fit together.

Session Outline
1. Intro: Components of the Cloud Native "Modern" Data Platform
Learn which critical components make up the modern data ecosystem. This section lays the foundation for the rest of the session and introduces the essential vocabulary and ideas so that beginners and seasoned veterans alike can follow the big picture.

2. Design Patterns: Containerized Data Applications
Understand why containers are essential to the modern data ecosystem, and why they enable revolutionary capabilities when operating distributed data products and high-volume data ecosystems (platforms, pipelines, etc.). Learn how to take a layered approach to building consistency and reliability into your data strategy, so you can overcome what may at times feel like a turbulent sea of moving parts across an ever-expanding ocean of data needs.

3. Just Roll with It: Orchestration & Auto-Scaling Data Applications with Kubernetes
The majority of the session (45 minutes) will dive into the orchestration process and showcase how composable cloud-native data applications can take flight and grow or shrink elastically to handle your use cases of today, as well as the use cases of tomorrow. Learn to accept the unknowns, and deploy applications that can ride any data wave you throw at them.

Background Knowledge
This session can be appreciated at all levels. A background in systems design, Linux, networking, distributed systems, and data-intensive systems can help fill in the gaps the session won't have time to cover. A basic understanding of Kubernetes (K8s) is also nice to have.


Scott Haines is a full-stack engineer with a current focus on real-time, highly available, trustworthy analytics systems. He is currently working at Twilio as a Software Architect and previously worked as Principal Software Engineer on the Voice Insights team, where he helped drive Spark adoption and streaming pipeline architectures and build out a massive stream-processing platform.

Prior to Twilio, he wrote the backend Java APIs for Yahoo! Games, as well as the real-time game ranking/rating engine (built on Storm) that provided personalized recommendations and page views for 10 million customers. He finished his tenure at Yahoo! working for Flurry Analytics, where he wrote the alerts/notifications system for mobile.

Open Data Science
One Broadway
Cambridge, MA 02142
