Stream grouping

A stream grouping defines how the tuples of a stream are partitioned among the tasks of the consuming bolt. Storm provides the following groupings:

  • Shuffle grouping: Tuples are distributed randomly across the bolt's tasks, so that each task receives a roughly equal number of tuples.
  • Fields grouping: Tuples are partitioned by the values of one or more fields, so tuples with the same values for those fields always go to the same task. For example, to send all tweets from the same user to the same task of a bolt, group the stream on the user field.
  • All grouping: Each tuple is replicated to every task of the bolt. Filtering is one operation where all grouping may be needed.
  • Global grouping: All tuples are sent to a single task of the bolt (the task with the lowest ID). Reduce is one operation where global grouping is needed.
  • Direct grouping: The producer of the tuple decides which task of the consumer will receive it. This is possible only for streams that are declared as direct streams.
  • Local or shuffle grouping: If the target bolt has one or more tasks in the same worker process as the source, tuples are shuffled among only those in-process tasks, which avoids sending the data across the network. Otherwise, it behaves the same as shuffle grouping.
  • Custom grouping: You can define your own grouping by implementing the CustomStreamGrouping interface.
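
To make the differences between the groupings above more concrete, the following is a minimal plain-Java sketch of how a target task could be chosen for the shuffle, fields, all, and global groupings. It does not use the Storm API and is not Storm's implementation: the class GroupingSketch and all of its methods are hypothetical names, tasks are assumed to be numbered from 0, and shuffle grouping is approximated with round-robin instead of random selection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: mimics how each grouping could pick target task indexes.
// This is NOT Storm's code; every name in this class is hypothetical.
final class GroupingSketch {

    // Shuffle grouping, approximated here as round-robin over the tasks
    // (assumption: Storm itself picks tasks randomly).
    static int shuffleTask(long tupleSeq, int numTasks) {
        return (int) Math.floorMod(tupleSeq, (long) numTasks);
    }

    // Fields grouping: equal grouping values always map to the same task.
    static int fieldsTask(List<Object> groupingValues, int numTasks) {
        return Math.floorMod(groupingValues.hashCode(), numTasks);
    }

    // All grouping: every task receives a copy of the tuple.
    static List<Integer> allTasks(int numTasks) {
        List<Integer> targets = new ArrayList<>();
        for (int task = 0; task < numTasks; task++) {
            targets.add(task);
        }
        return targets;
    }

    // Global grouping: the whole stream goes to one task, the lowest index here.
    static int globalTask() {
        return 0;
    }

    public static void main(String[] args) {
        int numTasks = 4;
        String[] users = {"alice", "bob", "alice", "carol", "bob"};
        for (int seq = 0; seq < users.length; seq++) {
            // The grouping key is the value of the (assumed) "user" field.
            List<Object> key = Arrays.<Object>asList(users[seq]);
            System.out.printf(
                "tuple %d (user=%s): shuffle -> %d, fields -> %d, all -> %s, global -> %d%n",
                seq, users[seq],
                shuffleTask(seq, numTasks),
                fieldsTask(key, numTasks),
                allTasks(numTasks),
                globalTask());
        }
    }
}
```

Running main shows that both "alice" tuples get the same fields-grouping task even though their shuffle tasks differ. In an actual topology the choice is made declaratively, with methods such as shuffleGrouping, fieldsGrouping, allGrouping, and globalGrouping on the declarer returned by TopologyBuilder.setBolt.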