Why wait to process data in hourly or daily batches when you can have correct results now? Streaming technologies have reached a level of maturity sufficient for mainstream adoption. With this practical book, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way.
This handy pocket reference explains the what, where, when, and how of processing real-time data streams. You'll learn:
- Core principles and concepts behind robust out-of-order data processing
- Strategies for choosing data processing windows
- How watermarks track progress and completeness in infinite datasets
- How the concepts of streams and tables form the foundations of both batch and streaming data processing
- How time-varying relations provide a link between stream processing and the world of SQL and relational algebra
- Modern technologies used in the streaming ecosystem
For a more detailed look at stream processing, check out O'Reilly's Streaming Systems by Tyler Akidau, Slava Chernyak, and Reuven Lax.