Google Dataflow: The new open model for batch and stream processing

Robert Kubis

In 2004 Google introduced MapReduce, a model that kick-started big data. Ten years later, Google published Dataflow, a new paradigm that integrates batch and stream processing in one common abstraction. This time it was not just a paper, but also an open source Java SDK and a managed cloud service to run it. In 2016 Dataflow was proposed for incubation at the Apache Software Foundation, and Apache Beam was born, unifying batch and streaming and bringing the big data world closer together. We’ll demonstrate Dataflow’s capabilities through a real-time demo, with practical insights on how to manage and visualize streams of data.

Language: English

Level: Beginner

