
You're reading from Practical Real-time Data Processing and Analytics: Distributed Computing and Event Processing using Apache Spark, Flink, Storm, and Kafka

Product type Paperback

Published in Sep 2017

Publisher Packt

ISBN-13 9781787281202

Length 360 pages

Edition 1st Edition

Languages

Processing

Tools

Apache Spark

Concepts

Data Analysis

Authors (2):

Shilpi Saxena

Saurabh Gupta


Table of Contents (14) Chapters

Preface

1. Introducing Real-Time Analytics

2. Real Time Applications – The Basic Ingredients

3. Understanding and Tailing Data Streams

4. Setting up the Infrastructure for Storm

5. Configuring Apache Spark and Flink

6. Integrating Storm with a Data Source

7. From Storm to Sink

8. Storm Trident

9. Working with Spark

10. Working with Spark Operations

11. Spark Streaming

12. Working with Apache Flink

13. Case Study

Flink persistence

Flink provides connectors for sinks and persistence stores such as:

Apache Kafka
Elasticsearch
Hadoop Filesystem
RabbitMQ
Amazon Kinesis Streams
Apache NiFi
Apache Cassandra

In this book, we discuss the connection between Flink and Cassandra, as it is one of the most popular.

Integration with Cassandra

We covered the setup of Cassandra in previous chapters, so we will go directly to the program required to connect Flink to Cassandra:

Add the dependencies to pom.xml:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-cassandra_2.11</artifactId>
    <version>1.2.0</version>
</dependency>
<dependency>
    <groupId>com.codahale.metrics</groupId>
    <artifactId>metrics-json</artifactId>
    <version>3.0.2</version>
</dependency>

Create the data stream:

DataStream<Tuple4<Long,Integer,Integer,Long>> messageStream = env.addSource...
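A Cassandra sink for a tuple stream is usually driven by a parameterized INSERT statement with one ? placeholder per tuple field, bound in field order. The following is a minimal sketch of building such a statement for the four-field stream above. The helper class, the keyspace demo, the table messages, and the column names are all hypothetical and are not taken from the book; the sketch uses only the Java standard library so that it runs without the connector.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of Flink or the book's code.
class CassandraInsertQuery {

    // Builds "INSERT INTO ks.table (c1, c2) VALUES (?, ?);" with one
    // placeholder per column, in the same order as the columns.
    static String build(String keyspace, String table, List<String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        String columnList = String.join(", ", columns);
        String placeholders =
                String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + keyspace + "." + table
                + " (" + columnList + ") VALUES (" + placeholders + ");";
    }

    public static void main(String[] args) {
        // Four assumed columns, one per field of
        // Tuple4<Long, Integer, Integer, Long>.
        List<String> columns =
                Arrays.asList("id", "field_a", "field_b", "event_time");
        String query = build("demo", "messages", columns);
        System.out.println(query);
        // With the connector on the classpath, a statement of this shape is
        // what would be handed to the sink builder's setQuery(...) call.
    }
}
```

Running main prints INSERT INTO demo.messages (id, field_a, field_b, event_time) VALUES (?, ?, ?, ?); and the number of placeholders must match the arity of the tuple type in the stream.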
