Abstract: Deploying ML in production is hard, and data is often the hardest part. Unlike analytics pipelines, production ML pipelines need to process both historical data for training and fresh data for online serving, often drawing on streaming or real-time data sources. They must ensure training/serving parity, provide point-in-time correctness, and serve data at production service levels. These challenges are difficult to tackle with traditional data orchestration tools and can add weeks or months to project timelines.
In this session, Willem Pienaar will discuss why his team built the Feast feature store at Gojek, and how it enabled Gojek to scale its ML initiatives quickly and efficiently. Feast has since been open sourced and is now used by hundreds of organizations, including Twitter, Shopify, and Robinhood. Willem will explain how Feast is typically used in an enterprise context, with an overview of use cases, architectural patterns, and the value delivered.
To solve the broader data problem for ML, Willem will share his vision for delivering a complete feature platform. A feature platform includes a feature store, but also provides automated data pipelines that transform data from batch and real-time sources. Designed to manage the complete lifecycle of real-time ML features, the feature platform is the next logical phase in the evolution of data tooling for operational ML.
Attendees will learn about the specific problems that Feast solves to enable operational ML. Willem will show how Feast can be used to:
- Define features as code
- Store feature values in offline and online stores
- Serve data for training
- Serve data online for real-time inference
- Transform data and materialize feature values in real time
- Monitor pipeline health, data drift, and online service levels
Bio: Coming soon!
Principal Data Engineer | Tecton