Most streaming engines focus on performing computations on a stream: for example, one can map a stream to run a function on each record, reduce it to aggregate events by time, etc. However, as we worked with users, we found that virtually no use case of streaming engines only involved performing computations on a stream. Instead, stream processing happens as part of a larger application, which we’ll call a continuous application.
Online machine learning and serving real-time data are examples that show streaming computations are part of larger applications that include serving, storage, or batch jobs. Unfortunately, in current systems, streaming computations run on their own, in an engine focused just on streaming. This leaves developers responsible for the complex tasks of interacting with external systems (e.g. managing transactions) and making their result consistent with the the rest of the application (e.g., batch jobs). This is what we’d like to solve with continuous applications.