At DoorDash we process 6 billion messages a day across more than 2,500 real-time pipelines. These pipelines ingest data from mobile devices and internal services, stream-process it with Apache Flink and Kafka, and write the results to our data lakes. We then query the data lakes with Trino, Pinot, and other tools.
Along the way, we have built a rich set of automation tools to manage the lifecycle of these pipelines, from provisioning to clean-up. The lessons learned as our business and data needs have scaled guide us in improving both the pipelines and the automation around them.
We hope to share these learnings with the broader stream-processing community.