You're reading from Learning Real-time Analytics with Storm and Cassandra Solve real-time analytics problems effectively using Storm and Cassandra

Product type Paperback

Published in Mar 2015

Publisher

ISBN-13 9781784395490

Length 220 pages

Edition 1st Edition

Languages

Java

Tools

Storm

Concepts

Data Processing

Author (1):

Shilpi Saxena

Table of Contents (14) Chapters

Preface

1. Let's Understand Storm

2. Getting Started with Your First Topology

3. Understanding Storm Internals by Examples

4. Storm in a Clustered Mode

5. Storm High Availability and Failover

6. Adding NoSQL Persistence to Storm

7. Cassandra Partitioning, High Availability, and Consistency

8. Cassandra Management and Maintenance

9. Storm Management and Maintenance

10. Advance Concepts in Storm

11. Distributed Cache and CEP with Storm

A. Quiz Answers

Index

A high-level view of various components of Storm

In this section, we will get you acquainted with various components of Storm, their role, and their distribution in a Storm cluster.

A Storm cluster has three sets of nodes, which can be co-located on the same machines but are generally distributed across separate ones:

Nimbus
Zookeeper
Supervisor
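
To make the three roles concrete, here is a minimal sketch of how each one typically appears in a cluster's settings. It assumes the Storm 0.9.x key names from storm.yaml (nimbus.host, storm.zookeeper.servers, supervisor.slots.ports); newer Storm releases replace nimbus.host with nimbus.seeds. The hostnames and the ClusterLayoutSketch class are hypothetical, and a plain Map stands in for Storm's own configuration classes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a plain Map standing in for the contents of storm.yaml.
class ClusterLayoutSketch {

    // Builds one entry per node type; all hostnames are hypothetical.
    static Map<String, Object> clusterConfig() {
        Map<String, Object> conf = new LinkedHashMap<>();
        // Nimbus: the single master node (Storm 0.9.x key name, assumed here).
        conf.put("nimbus.host", "nimbus.example.com");
        // Zookeeper: the ensemble that holds the cluster state.
        conf.put("storm.zookeeper.servers",
                Arrays.asList("zk1.example.com", "zk2.example.com", "zk3.example.com"));
        // Supervisor: each listed port is one worker slot on a supervisor node.
        conf.put("supervisor.slots.ports", Arrays.asList(6700, 6701, 6702, 6703));
        return conf;
    }

    public static void main(String[] args) {
        clusterConfig().forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
```

Each port under supervisor.slots.ports corresponds to one worker process that the supervisor can run, so four ports means at most four workers on that node.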

The following figure shows the integration hierarchy of these nodes:

[Figure: A high-level view of various components of Storm]

The detailed explanation of the integration hierarchy is as follows:

Nimbus node (master node, similar to the Hadoop JobTracker): This is the heart of the Storm cluster. It runs the master daemon process, which is responsible for the following:
- Distributing tasks across the cluster
- Uploading and distributing the topology JARs to the various supervisors
- Assigning workers to the ports (slots) allocated on the supervisor nodes, where the supervisors launch them
- Monitoring the topology execution and reallocating workers whenever necessary
- Hosting the Storm UI, which typically runs on the same node
Zookeeper nodes: Zookeeper acts as the bookkeeper of the Storm cluster. Once a topology has been submitted and distributed by Nimbus, it continues to execute even if Nimbus dies, because the working state is maintained by Zookeeper for as long as the Zookeeper nodes are alive. The main responsibility of this component is to maintain the operational state of the cluster and to allow that state to be restored when recovery from a failure is required. It is the coordinator of the Storm cluster.
Supervisor nodes: These are the main processing chambers in the Storm cluster; all the action happens here. Supervisors are daemon processes that listen for and manage the work assigned to them. They communicate with Nimbus through Zookeeper, and start and stop workers according to signals from Nimbus.
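
The interaction described above can be sketched as a toy model. This is not Storm code: the classes StormRolesSketch, CoordinationStore, Nimbus, and Supervisor are hypothetical stand-ins, and the in-memory map only imitates the role Zookeeper plays. The sketch shows one idea under those assumptions: Nimbus writes assignments into the shared store, the supervisor reads them from the store rather than from Nimbus, and so workers keep running after Nimbus goes away.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model of the three roles; none of these classes are Storm APIs.
class StormRolesSketch {

    /** Stand-in for Zookeeper: shared state mapping "supervisorId:port" to a topology name. */
    static class CoordinationStore {
        final Map<String, String> assignments = new TreeMap<>();
    }

    /** Stand-in for Nimbus: decides which worker slots run a topology. */
    static class Nimbus {
        private final CoordinationStore store;

        Nimbus(CoordinationStore store) {
            this.store = store;
        }

        // Record the topology against the first free slots, up to `workers` of them.
        void submit(String topology, int workers, List<String> slots) {
            int assigned = 0;
            for (String slot : slots) {
                if (assigned == workers) {
                    break;
                }
                if (!store.assignments.containsKey(slot)) {
                    store.assignments.put(slot, topology);
                    assigned++;
                }
            }
        }

        // Remove every assignment that belongs to the topology.
        void kill(String topology) {
            store.assignments.values().removeIf(topology::equals);
        }
    }

    /** Stand-in for a supervisor: starts and stops workers to match the store. */
    static class Supervisor {
        final String id;
        final int[] ports;
        final Set<Integer> running = new TreeSet<>();

        Supervisor(String id, int... ports) {
            this.id = id;
            this.ports = ports;
        }

        // One slot name per port, in the order the ports were given.
        List<String> slots() {
            List<String> slots = new ArrayList<>();
            for (int port : ports) {
                slots.add(id + ":" + port);
            }
            return slots;
        }

        // Reads assignments from the store only; it never calls Nimbus directly.
        void sync(CoordinationStore store) {
            for (int port : ports) {
                if (store.assignments.containsKey(id + ":" + port)) {
                    running.add(port);    // start a worker on this port
                } else {
                    running.remove(port); // stop the worker, if any
                }
            }
        }
    }

    public static void main(String[] args) {
        CoordinationStore store = new CoordinationStore();
        Supervisor supervisor = new Supervisor("sup1", 6700, 6701, 6702);

        Nimbus nimbus = new Nimbus(store);
        nimbus.submit("wordcount", 2, supervisor.slots());
        nimbus = null; // simulate Nimbus dying after submission

        supervisor.sync(store);
        System.out.println("Running worker ports: " + supervisor.running);
    }
}
```

Running main prints the two ports that were assigned before the simulated Nimbus failure, because the supervisor only ever consults the store. Calling kill on a fresh Nimbus and syncing again would empty the running set, which mirrors workers being stopped on a signal from Nimbus.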
