Packt+ | Advance your knowledge in tech

You're reading from Elasticsearch 5.x Cookbook Distributed Search and Analytics

Product type Paperback

Published in Feb 2017

Publisher

ISBN-13 9781786465580

Length 696 pages

Edition 3rd Edition

Languages

Java

Tools

Elasticsearch

Concepts

Enterprise Search

Author (1):

Alberto Paro

View More author details

Table of Contents (25) Chapters

Credits

About the Author

About the Reviewer

www.PacktPub.com

Customer Feedback

Dedication

Preface

1. Getting Started

2. Downloading and Setup FREE CHAPTER

3. Managing Mappings

4. Basic Operations

5. Search

6. Text and Numeric Queries

7. Relationships and Geo Queries

8. Aggregations

9. Scripting

10. Managing Clusters and Nodes

11. Backup and Restore

12. User Interfaces

13. Ingest

14. Java Integration

15. Scala Integration

16. Python Integration

17. Plugin Development

18. Big Data Integration

Understanding cluster, replication, and sharding

Related to shards management, there are key concepts of replication and cluster status.

Getting ready

You need one or more nodes running to have a cluster. To test an effective cluster, you need at least two nodes (that can be on the same machine).

How it works...

An index can have one or more replicas (full copies of your data, automatically managed by Elasticsearch): the shards are called primary ones if they are part of the primary replica, and secondary ones if they are part of other replicas.

To maintain consistency in write operations, the following workflow is executed:

The write is first executed in the primary shard
If the primary write is successfully done, it is propagated simultaneously in all the secondary shards
If a primary shard becomes unavailable, a secondary one is elected as primary (if available) and the flow is re-executed

During search operations, if there are some replicas, a valid set of shards is chosen randomly between primary and secondary to improve performances. Elasticsearch has several allocation algorithms to better distribute shards on nodes. For reliability, replicas are allocated in a way that if a single node becomes unavailable, there is always at least one replica of each shard that is still available on the remaining nodes.

The following figure shows some example of possible shards and replica configuration:

The replica has a cost to increase the indexing time due to data node synchronization and also the time spent to propagate the message to the slaves (mainly in an asynchronous way).

Best practice

To prevent data loss and to have high availability, it's good to have at least one replica; so, your system can survive a node failure without downtime and without loss of data.

A typical approach for scaling performance in search when your customer number is to increase the replica number.

There's more...

Related to the concept of replication, there is the cluster status indicator of the health of your cluster.

It can cover three different states:

Green: This state depicts that everything is ok.
Yellow: This state depicts that some shards are missing but you can work.
Red: This state depicts that, "Houston we have a problem". Some primary shards are missing. The cluster will not accept writing and errors and stale actions may happen due to missing shards. If the missing shard cannot be restored, you have lost your data.

Solving the yellow status

Mainly yellow status is due to some shards that are not allocated.
If your cluster is in "recovery" status (this means that it's starting up and checking the shards before we put them online), just wait so that the shards start up process ends.
After having finished the recovery, if your cluster is always in yellow state, you may not have enough nodes to contain your replicas (because, for example, the number of replicas is bigger than the number of your nodes). To prevent this, you can reduce the number of your replicas or add the required number of nodes.

Note

The total number of nodes must not be lower than the maximum number of replicas.

Solving the red status

You have loss of data. This is when you have one or more shards missing.
You need to try to restore the node(s) that are missing. If your nodes restart and the system goes back to yellow or green status, you are safe. Otherwise, you have lost data and your cluster is not usable: delete the index/indices and restore them from backups or snapshots (if you have already done it) or from other sources.

To prevent data loss, I suggest having always at least two nodes and a replica set to 1.