Stream processing

The stream processing component consists of three main sub-components:

  • The Broker: collects and holds the events or data streams coming from the data collection agents
  • The Processing Engine: transforms, correlates, and aggregates the data, and performs other necessary operations
  • The Distributed Cache: maintains common datasets across all distributed components of the Processing Engine
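To make the division of responsibilities concrete, here is a minimal, hypothetical in-memory sketch of the three sub-components and how data flows between them. The class names and the event shape are illustrative stand-ins, not the API of any real broker or cache product:

```python
from collections import deque

class Broker:
    """Stand-in for a message broker: holds events until they are consumed."""
    def __init__(self):
        self._queue = deque()
    def publish(self, event):
        self._queue.append(event)
    def poll(self):
        return self._queue.popleft() if self._queue else None

class DistributedCache:
    """Stand-in for a distributed cache: common state shared by all engine workers."""
    def __init__(self):
        self._store = {}
    def incr(self, key, amount=1):
        self._store[key] = self._store.get(key, 0) + amount
        return self._store[key]
    def get(self, key):
        return self._store.get(key, 0)

class ProcessingEngine:
    """Consumes events from the broker, transforms and aggregates them, updates the cache."""
    def __init__(self, broker, cache):
        self.broker, self.cache = broker, cache
    def run_once(self):
        event = self.broker.poll()
        if event is not None:
            # Transform: normalize the key, then aggregate a running count per user.
            self.cache.incr(event["user"].lower())
        return event

broker, cache = Broker(), DistributedCache()
engine = ProcessingEngine(broker, cache)
for user in ["Alice", "bob", "ALICE"]:
    broker.publish({"user": user})
while engine.run_once() is not None:
    pass
print(cache.get("alice"))  # → 2
```

In a real deployment, each class would be replaced by a distributed system (for instance, a broker cluster, a fleet of engine workers, and an external cache), but the data flow stays the same: agents publish to the broker, the engine polls and processes, and shared state lives in the cache.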

These sub-components of the stream processing component are depicted in more detail in the diagram that follows:

There are a few key attributes that the stream processing component should provide:

  • Distributed components, offering resilience to failures
  • Scalability, to cater for the growing needs of an application or sudden surges in traffic
  • Low latency, to meet the overall SLAs expected from such applications
  • Easy operationalization, so that evolving use cases can be supported
  • Built for failure: the system should recover from inevitable failures without any event loss, and should be able to reprocess from the point at which it failed
  • Easy integration with off-heap/distributed caches and data stores
  • A wide variety of operations, extensions, and functions to meet the business requirements of the use case
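The "built for failure" attribute above is usually achieved by checkpointing consumed offsets against a replayable log, so a restarted worker resumes from its last committed position instead of losing events. The sketch below is a hypothetical illustration of that pattern; in practice the checkpoint would live in durable external storage rather than an object attribute:

```python
class ReplayableLog:
    """Stand-in for a broker that retains events, so they can be re-read after a crash."""
    def __init__(self, events):
        self.events = list(events)
    def read_from(self, offset):
        # Events are not deleted on read; any offset can be replayed.
        return list(enumerate(self.events[offset:], start=offset))

class CheckpointingConsumer:
    def __init__(self, log):
        self.log = log
        self.committed_offset = 0   # durable checkpoint (external storage in reality)
        self.processed = []
    def run(self, fail_at=None):
        for offset, event in self.log.read_from(self.committed_offset):
            if offset == fail_at:
                raise RuntimeError("simulated crash before commit")
            self.processed.append(event)
            # Commit only after successful processing: at-least-once delivery.
            self.committed_offset = offset + 1

log = ReplayableLog(["e0", "e1", "e2", "e3"])
consumer = CheckpointingConsumer(log)
try:
    consumer.run(fail_at=2)      # crash while handling e2
except RuntimeError:
    pass
consumer.run()                   # restart: resumes at offset 2, no events lost
print(consumer.processed)        # → ['e0', 'e1', 'e2', 'e3']
```

Because the offset is committed only after an event is fully processed, a crash can cause an event to be handled more than once but never skipped; frameworks that need exactly-once semantics layer deduplication or transactional commits on top of this basic scheme.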

These aspects should be considered when identifying and selecting a stream processing application/framework for a real-time use case implementation.