Reactive Summit 2020 has ended
Tuesday, November 10 • 15:15 - 15:45
Tale of Stateful Stream to Stream Processing

Streaming engines like Apache Flink are redefining how we process data. Flink makes it possible to extract, transform, and write data with an ease matching that of batch data processing frameworks. There are plenty of known and proven use cases showing how to convert a single batch job into a streaming job. However, there are quite a few challenges when converting a stateful end-to-end batch workflow into multiple stateful streaming jobs. Netflix processes payments for 180M+ members across 190 countries. Payment processing and transaction data are critical for measuring the operational health and performance of our payments platform. We decided to move the existing batch workflow completely to streaming. Things got exciting when we wanted to introduce multiple streaming jobs with zero data loss and high accuracy. In this talk, we describe how we converted a conventional, complex stateful batch workflow to a multi-step stateful streaming workflow at Netflix using Flink. You'll learn about:
1) Design and architecture involving multiple stateful streaming jobs
2) Managing schema evolution using Avro for stateful real-time applications
3) Sharing code between Flink and Spark for any fallback batch processing
4) Handling cascading impact when events arrive out of order
5) Landing processed data in real time into multiple sinks such as Iceberg and Druid

Speakers
Ajit Koti

Senior Engineer, Netflix
Ajit Koti is a Senior Engineer on the Growth Data Engineering team at Netflix, building and architecting large-scale distributed systems and real-time data processing engines. Ajit previously worked at Fanatics, IBM Labs, and J P Morgan and has extensive experience in building distributed...


Tuesday November 10, 2020 15:15 - 15:45 PST
streams