Powering Millions of Real-time Decisions with Distributed Model Serving


Millions of critical real-time decisions are made each day by online machine learning models at Lyft, shaping how riders move and how drivers earn. To enable these decisions efficiently at scale, we grappled with several technical challenges:

(1) How could we design a serving system that performs model inference with single-digit millisecond latency at a throughput of 1,000,000+ requests per second?

(2) How could we make such a system support model sizes ranging from a few kilobytes to gigabytes, with models updated as often as every couple of minutes?

(3) How could we empower 40+ teams, with use cases spanning fraud detection, pricing, safety, ETAs, and more, to use any modeling library they choose so that they can ship effective models fast, without constraints?

We built LyftLearn Serving, a scalable, flexible, distributed online model serving system, to overcome these challenges. In this talk, we give an overview of the online model serving requirements at Lyft that drove us to build LyftLearn Serving. We showcase the techniques we used to tackle these challenges and achieve a low-latency, high-throughput model serving system that powers the products of 40+ teams. We also present the design decisions we made in LyftLearn Serving for efficiently versioning, deploying, testing, and monitoring ML models, and describe tradeoffs that should help and inspire MLOps practitioners building similar systems.


Hakan is a staff software engineer on the ML Platform team at Lyft. The team builds ML development, training, and serving systems that help 40+ teams. Previously, Hakan was a staff engineer at Box, where he helped build cloud content management applications focused on security and scaled Kubernetes clusters and service meshes in an on-premise infrastructure. He started his career at the hardware level, building ASICs, and transitioned to distributed systems software at a startup. Hakan is passionate about wearing many hats, switching abstraction levels, operational excellence, and mentorship, and loves challenges and problems that take the whole team to solve.

Open Data Science
One Broadway
Cambridge, MA 02142
