
Abstract: Feature engineering is supposed to be an iterative process that transforms raw data into training examples and feature vectors. Iteration is key, but each cycle should include trying new ideas offline as well as testing them in production.
Offline experimentation requires historical event-based data to compute training examples at the right points in time, quickly, without waiting for complex pipelines to be built just to determine whether a feature will be useful. Then, in the latter part of each iteration cycle, we need to test the new model live without worrying about discrepancies between offline and online data.
Feature stores are the newest idea that is supposed to help, but it turns out that they are not enough on their own. In this session, you’ll learn how to craft production-ready features and build training datasets at the right points in time from event-based data. Specifically, we’ll be covering strategies for powering feature stores with a feature engine to:
- Compute directly from event-based data to try new features
- Iterate on feature definitions and time selection across historical data instantly
- Join values between different entities at precise times — without leakage
- Eliminate data discrepancies in production
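To make the idea of a point-in-time join without leakage concrete, here is a minimal sketch in plain Python. It is not the feature engine discussed in this session; the `Event` type, the helper names, and the sample data are all hypothetical. It assumes integer timestamps and treats an event recorded at exactly the label time as already visible, which is a choice a real system would need to make deliberately.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical raw event: one observed value for one entity at one time."""
    entity: str
    time: int      # event timestamp (assumed: integer seconds)
    value: float


def build_index(events):
    """Group events by entity, with times and values sorted by time."""
    index = {}
    for e in sorted(events, key=lambda e: e.time):
        times, values = index.setdefault(e.entity, ([], []))
        times.append(e.time)
        values.append(e.value)
    return index


def value_as_of(index, entity, time):
    """Return the latest value seen at or before `time`, or None if there is none.

    Events after `time` are never consulted, so a training example cannot
    leak information from its own future.
    """
    if entity not in index:
        return None
    times, values = index[entity]
    pos = bisect_right(times, time)  # number of events with timestamp <= time
    return values[pos - 1] if pos else None


def training_examples(labels, index):
    """Attach the as-of feature value to each (entity, time, label) row."""
    return [(entity, t, value_as_of(index, entity, t), label)
            for entity, t, label in labels]


# Hypothetical sample data.
events = [
    Event("user_1", 10, 1.0),
    Event("user_1", 20, 2.0),
    Event("user_1", 30, 3.0),
    Event("user_2", 15, 5.0),
]
labels = [("user_1", 25, 1), ("user_2", 10, 0), ("user_1", 30, 0)]

examples = training_examples(labels, build_index(events))
for row in examples:
    print(row)
```

In this sketch the example for `user_1` at time 25 uses the value recorded at time 20, not the later value at time 30, and `user_2` at time 10 gets no value because its only event has not happened yet.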
Come join us to learn how to finally iterate on amazing ML models with event-based data.
Bio: Dr. Charna Parkey is Vice President of Product at Kaskada, where she co-created the first commercially available feature engine with time travel. She has over 15 years’ experience in enterprise data science and adaptive algorithms in the defense and startup tech sectors, and has worked with dozens of Fortune 500 companies as a data scientist. She earned her Ph.D. in Electrical Engineering from the University of Central Florida.