Learning Apache Cassandra: Build an efficient, scalable, fault-tolerant, and highly-available data layer into your application using Cassandra

Product type: Paperback

Published: Feb 2015

ISBN-13: 9781783989201

Length: 246 pages

Edition: 1st Edition

Languages: Java

Tools: Cassandra

Concepts: Database Programming

Author: Matthew Brown

Table of Contents

Preface

1. Getting Up and Running with Cassandra

2. The First Table

3. Organizing Related Data

4. Beyond Key-Value Lookup

5. Establishing Relationships

6. Denormalizing Data for Maximum Performance

7. Expanding Your Data Model

8. Collections, Tuples, and User-defined Types

9. Aggregating Time-Series Data

10. How Cassandra Distributes Data

A. Peeking Under the Hood

B. Authentication and Authorization

Index

Creating a keyspace

A keyspace is a collection of related tables, equivalent to a database in a relational system. To create the keyspace for our MyStatus application, issue the following statement in the CQL shell:

CREATE KEYSPACE "my_status"
WITH REPLICATION = {
  'class': 'SimpleStrategy', 'replication_factor': 1
};

Here, we created a keyspace called my_status, which we will use for the remainder of this book. When we create a keyspace, we have to specify replication options. Cassandra provides several strategies for managing the replication of data; SimpleStrategy is the appropriate choice as long as your Cassandra deployment does not span multiple data centers. The replication_factor value tells Cassandra how many copies of each piece of data to keep in the cluster; since we are running only a single instance of Cassandra, there is no point in keeping more than one copy of the data. In a production deployment, you would want a higher replication factor so that data remains available when a node is lost; 3 is a good place to start.
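For comparison, here is a minimal sketch of what a keyspace definition might look like for a deployment that does span data centers, using Cassandra's NetworkTopologyStrategy. The keyspace name my_status_prod and the data center name dc1 are assumptions made for illustration; the data center name must match whatever name your cluster actually reports.

```
-- Hypothetical production keyspace: keep 3 copies of each piece of data
-- in the data center named 'dc1' (an assumed name; use your cluster's own).
CREATE KEYSPACE "my_status_prod"
WITH REPLICATION = {
  'class': 'NetworkTopologyStrategy', 'dc1': 3
};

-- In the CQL shell, print a keyspace's definition to check its
-- replication options.
DESCRIBE KEYSPACE "my_status_prod";
```

With NetworkTopologyStrategy, the replication map holds one entry per data center rather than a single replication_factor, so each data center can keep a different number of copies.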

Note

A few things at this point are worth noting about CQL's syntax:

It's syntactically very similar to SQL; as we further explore CQL, the impression of similarity will not diminish.
Double quotes are used for identifiers such as keyspace, table, and column names. As in SQL, quoting identifier names is usually optional, unless the identifier is a keyword or contains a space or another character that will trip up the parser.
Single quotes are used for string literals; the key-value structure we use for replication is a map literal, which is syntactically similar to an object literal in JSON.
As in SQL, CQL statements in the CQL shell must terminate with a semicolon.
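One practical consequence of the quoting rules is case handling: unquoted identifiers are case-insensitive and are folded to lowercase, while quoted identifiers keep their case exactly. The following sketch illustrates the difference; the mixed-case keyspace name is hypothetical and is used only to show the behavior.

```
-- Unquoted identifiers are folded to lowercase, so My_Status refers to
-- the same keyspace as "my_status". IF NOT EXISTS makes this statement
-- a no-op when that keyspace already exists.
CREATE KEYSPACE IF NOT EXISTS My_Status
WITH REPLICATION = {
  'class': 'SimpleStrategy', 'replication_factor': 1
};

-- A quoted identifier keeps its case, so this would create a second,
-- distinct keyspace (hypothetical name, for illustration only).
CREATE KEYSPACE "My_Status"
WITH REPLICATION = {
  'class': 'SimpleStrategy', 'replication_factor': 1
};
```

Sticking to lowercase names avoids the ambiguity, since the quoted and unquoted forms then refer to the same thing.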

Selecting a keyspace

Once you've created a keyspace, you'll want to use it. To do this, issue the USE command:

USE "my_status";

This tells Cassandra that, for the rest of the session, any table name given without an explicit keyspace refers to a table inside the my_status keyspace. If you close the CQL shell and reopen it, you'll need to reissue this command.
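As an alternative to USE, a table name can be qualified with its keyspace in the statement itself. The sketch below assumes a table called users, which is a hypothetical name chosen for illustration and has not been created at this point.

```
-- "users" is a hypothetical table name, used only for illustration.
-- Qualified with its keyspace, this works with or without a prior USE.
SELECT * FROM "my_status"."users";

-- After USE, the unqualified name resolves inside my_status.
USE "my_status";
SELECT * FROM "users";
```

Qualified names are handy when a single session needs to touch tables in more than one keyspace.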