Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Mastering Elasticsearch - Second Edition
Mastering Elasticsearch - Second Edition

Mastering Elasticsearch - Second Edition: Further your knowledge of the Elasticsearch server by learning more about its internals, querying, and data handling , Second Edition

eBook
$24.99 $36.99
Paperback
$60.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
OR
Modal Close icon
Payment Processing...
tick Completed

Billing Address

Table of content icon View table of contents Preview book icon Preview Book

Mastering Elasticsearch - Second Edition

Chapter 2. Power User Query DSL

In the previous chapter, we looked at what Apache Lucene is, how its architecture looks, and how the analysis process is handled. In addition to these, we saw what Lucene query language is and how to use it. We also discussed Elasticsearch, its architecture, and core concepts. In this chapter, we will dive deep into Elasticsearch focusing on the Query DSL. We will first go through how Lucene scoring formula works before turning to advanced queries. By the end of this chapter, we will have covered the following topics:

  • How the default Apache Lucene scoring formula works
  • What query rewrite is
  • What query templates are and how to use them
  • How to leverage complicated Boolean queries
  • What are the performance implications of large Boolean queries
  • Which query you should use for your particular use case

Default Apache Lucene scoring explained

A very important part of the querying process in Apache Lucene is scoring. Scoring is the process of calculating the score property of a document in a scope of a given query. What is a score? A score is a factor that describes how well the document matched the query. In this section, we'll look at the default Apache Lucene scoring mechanism: the TF/IDF (term frequency/inverse document frequency) algorithm and how it affects the returned document. Knowing how this works is valuable when designing complicated queries and choosing which queries parts should be more relevant than the others. Knowing the basics of how scoring works in Lucene allows us to tune queries more easily and the results retuned by them to match our use case.

When a document is matched

When a document is returned by Lucene, it means that it matched the query we've sent. In such a case, the document is given a score. Sometimes, the score is the same for all the documents...

Query rewrite explained

We have already talked about scoring, which is valuable knowledge, especially when trying to improve the relevance of our queries. We also think that when debugging your queries, it is valuable to know how all the queries are executed; therefore, it is because of this we decided to include this section on how query rewrite works in Elasticsearch, why it is used, and how to control it.

If you have ever used queries, such as the prefix query and the wildcard query, basically any query that is said to be multiterm, you've probably heard about query rewriting. Elasticsearch does that because of performance reasons. The rewrite process is about changing the original, expensive query to a set of queries that are far less expensive from Lucene's point of view and thus speed up the query execution. The rewrite process is not visible to the client, but it is good to know that we can alter the rewrite process behavior. For example, let's look at what Elasticsearch...

Query templates

When the application grows, it is very probable that the environment will start to be more and more complicated. In your organization, you probably have developers who specialize in particular layers of the application—for example, you have at least one frontend designer and an engineer responsible for the database layer. It is very convenient to have the development divided into several modules because you can work on different parts of the application in parallel without the need of constant synchronization between individuals and the whole team. Of course, the book you are currently reading is not a book about project management, but search, so let's stick to that topic. In general, it would be useful, at least sometimes, to be able to extract all queries generated by the application, give them to a search engineer, and let him/her optimize them, in terms of both performance and relevance. In such a case, the application developers would only have to pass...

Handling filters and why it matters

Let's have a look at the filtering functionality provided by Elasticsearch. At first it may seem like a redundant functionality because almost all the filters have their query counterpart present in Elasticsearch Query DSL. But there must be something special about those filters because they are commonly used and they are advised when it comes to query performance. This section will discuss why filtering is important, how filters work, and what type of filtering is exposed by Elasticsearch.

Filters and query relevance

The first difference when comparing queries to filters is the influence on the document score. Let's compare queries and filters to see what to expect. We will start with the following query:

curl -XGET "http://127.0.0.1:9200/library/_search?pretty" -d'
{
    "query": {
        "term": {
           "title": {
              "value": "front"
           }
        }
   ...

Choosing the right query for the job

In our Elasticsearch Server Second Edition, we described the full query language, the so-called Query DSL provided by Elasticsearch. A JSON structured query language that allows us to virtually build as complex queries as we can imagine. What we didn't talk about is when the queries can be used and when they should be used. For a person who doesn't have much prior experience with a full text search engine, the number of queries exposed by Elasticsearch can be overwhelming and very confusing. Because of that, we decided to extend what we wrote in the second edition of our first Elasticsearch book and show you, the reader, what you can do with Elasticsearch.

We decided to divide the following section into two distinct parts. The first part will try to categorize the queries and tell you what to expect from a query in that category. The second part will show you an example usage of queries from each group and will discuss the differences. Please...

Default Apache Lucene scoring explained


A very important part of the querying process in Apache Lucene is scoring. Scoring is the process of calculating the score property of a document in a scope of a given query. What is a score? A score is a factor that describes how well the document matched the query. In this section, we'll look at the default Apache Lucene scoring mechanism: the TF/IDF (term frequency/inverse document frequency) algorithm and how it affects the returned document. Knowing how this works is valuable when designing complicated queries and choosing which queries parts should be more relevant than the others. Knowing the basics of how scoring works in Lucene allows us to tune queries more easily and the results retuned by them to match our use case.

When a document is matched

When a document is returned by Lucene, it means that it matched the query we've sent. In such a case, the document is given a score. Sometimes, the score is the same for all the documents (like for...

Left arrow icon Right arrow icon

Description

This book is for Elasticsearch users who want to extend their knowledge and develop new skills. Prior knowledge of the Query DSL and data indexing is expected.

Who is this book for?

This book is for Elasticsearch users who want to extend their knowledge and develop new skills. Prior knowledge of the Query DSL and data indexing is expected.

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Feb 27, 2015
Length: 434 pages
Edition : 2nd
Language : English
ISBN-13 : 9781783553808
Vendor :
Elastic
Category :
Languages :

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
OR
Modal Close icon
Payment Processing...
tick Completed

Billing Address

Product Details

Publication date : Feb 27, 2015
Length: 434 pages
Edition : 2nd
Language : English
ISBN-13 : 9781783553808
Vendor :
Elastic
Category :
Languages :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total $ 176.97
Elasticsearch Server: Second Edition
$54.99
Mastering Elasticsearch - Second Edition
$60.99
ElasticSearch Cookbook - Second Edition
$60.99
Total $ 176.97 Stars icon
Banner background image

Table of Contents

10 Chapters
1. Introduction to Elasticsearch Chevron down icon Chevron up icon
2. Power User Query DSL Chevron down icon Chevron up icon
3. Not Only Full Text Search Chevron down icon Chevron up icon
4. Improving the User Search Experience Chevron down icon Chevron up icon
5. The Index Distribution Architecture Chevron down icon Chevron up icon
6. Low-level Index Control Chevron down icon Chevron up icon
7. Elasticsearch Administration Chevron down icon Chevron up icon
8. Improving Performance Chevron down icon Chevron up icon
9. Developing Elasticsearch Plugins Chevron down icon Chevron up icon
Index Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.3
(9 Ratings)
5 star 66.7%
4 star 11.1%
3 star 11.1%
2 star 11.1%
1 star 0%
Filter icon Filter
Top Reviews

Filter reviews by




R. Somerfield Mar 24, 2015
Full star icon Full star icon Full star icon Full star icon Full star icon 5
I've been using ElasticSearch professionally for over 6 months - our first version product using ElasticSearch has shipped this week!Whilst it is fairly easy to get started with ElasticSearch, there are a lot of fundamental aspects to it (and its underpinnings) which can have a dramatic affect on using it. And, whilst there are some good examples, they tend to be fairly simplistic.Up until I got this book I'd been (extensively!!) relying on Google. And whilst I've eventually managed to work out the answers, it took a lot of searching and therefore a lot longer than I'd have ideally liked. In addition, finding individual snippets on the web doesn't help with some of the broad knowledge.I found this book to be an excellent guide to help me understand the underpinnings of ElasticSearch, and also helped me to make many improvements in my Google-aquired knowledge. I would recommend it to anyone how is spending any amount of time with ElasticSearch.
Amazon Verified review Amazon
adnan baloch Apr 27, 2015
Full star icon Full star icon Full star icon Full star icon Full star icon 5
First off, a disclaimer for newbies: This book is meant for intermediate users of the Elasticsearch Server. Still, the book begins with a short but comprehensive introduction to the basic concepts used in document indexing, the various node types in Elasticsearch and the Apache Lucene library that powers Elasticsearch under the hood. The examples in the book are based on the premise that the user is running an online bookstore which is a powerful way to explore the possibilities offered by Elasticsearch. The examples are in JSON document format which should be familiar to any serious developer. The score of a query determines how well the document matches the input query. The scoring formula is explained using asimple example to demonstrate how it works in practice. Query rewrite methods, filters and types of queries are explained in detail with a special focus on their performance impact. Simple use cases let the readers know when to use which query group. A new sandboxed scripting language called Groovy is introduced that enables on-the-fly calculation of document scores without compromising the security of the search server. Lucene expressions are also given a brief touch. Readers will enjoy the chapter on improving search suggestions which can make a real difference in the search experience of users. Plenty of examples in this chapter help to take the guesswork out of improving query relevance. Filtering garbage results and using term faceting to narrow down search results are discussed to give readers the power to tailor their websites according to their needs for maximum user satisfaction. Scaling to accommodate increasing demands requires the right amount of shards and replicas. Deciding this amount is explored with a practical routing example. The final few chapters deal with low level index control, Elasticsearch administration, performance improving techniques and developing Elasticsearch plugins. whether you have a single node or an entire cluster of nodes in the cloud, the sheer amount of information contained in over 400 pages of this book ensures that readers will find this book a worthy companion in their quest to tame and tune Elasticsearch server for blazing fast query speed and highly relevant search results.
Amazon Verified review Amazon
Massera Riccardo Apr 14, 2015
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Mastering Elasticsearch is a very well written and comprehensive book that helps professionals already working with Elasticsearch to understand how it works and how to get the most from it.This book deals with every aspect of running Elasticsearch ranging from mastering queries and indexing, to optimization, to administration, to scaling and performance tuning.The authors explain in a clear and concise manner every topic, even when they delve into the internals of Elasticsearch, showing the reader how things work under the hoods with many examples and pictures.The first part of the book is introductory and explains the general architecture of Elasticsearch, indexing, queries scoring internals and how to obtain a good user experience.One particular subject that is analysed throughout the book is the relationship of Elasticsearch with the Lucene indexing and search engine, since Elasticsearch is built on top of it.In the second part, the authors teach the most advanced topics, like sharding and index allocation, low level index control, routing, administration and scaling an Elasticsearch system, giving to the reader the keys to correctly design, dimension and evolve an efficient Elasticsearch cluster.Overall, if you are already working with Elasticsearch and you need to know how to detect and handle performance issues or to improve user experience or to scale your Elasticsearch environment, this is an excellent book to read.
Amazon Verified review Amazon
A. Zubarev May 10, 2015
Full star icon Full star icon Full star icon Full star icon Full star icon 5
It is hard to underestimate the importance of the search nowadays. It probably even occurs without being noticed, but try to count how many times a day you tried to search for a product online, search through an online catalogue, an abbreviation or simply a weather forecast? Even a simple one page document or a webpage offers search capabilities (Ctrl-F). But have you ever wondered about how fast the search through Wiki is or how exact it is, even correcting your misspellings?These and more elaborate searches are a product of very powerful software. Typically thanks the Lucene index, it is like standing on the shoulders of a giant, Solr and Elasticsearch are capable of scouting through a sea of documents and terms in milliseconds, boosting the most relevant results to the top helping human or robot deliver business insight, guide through darkness of overwhelming amounts of information to the decision or helping buy the correct product.It becomes very obvious that these products encapsulate tons of advanced features and boast an array of capabilities, but sifting through the myriad of the features may at times become exhausting, and sure time consuming.This is where the excellent technical literature as Mastering Elasticsearch 2nd Edition makes a lot of sense. Please note, this is the 2nd edition in a very short period of time (less than two years). What it means, there are two things. First, the book is very popular so the authors get a lot of support and demand for a sequel, second, the technology is evolving fast (~ 100 pages added). All these are good news and a confirmation that Elasticsearch is a mature yet promising technology that is here to stay. It will not be needless to state that this book is seen by the authors as a companion book to the Elasticsearch Server 2nd Edition that I did not read, but the authors stress out that it is a good idea to start from one.The Mastering Elasticsearch book does feel like aiming at the search engineers, or those who already is involved in conceiving or using a product that will utilize the search capabilities of Elasticsearch. It is full of practical advice, insight and examples that are ranging from fine-tuning the searches to setting and properly configuring the cluster up. There is a chapter toward the end about how to crate plugins to any software project.I liked the following parts in the book: boosting search scores, using Groovy as a scripting language, troubleshooting and speeding up performance.Some knowledge of Java is assumed, but no special tooling or software is necessary to go through the book. But please be aware that you will type a lot of text, JSON specifically, so you may want an editor that has good support for JSON especially color highlighting e.g. the Eclipse JSON plugin. Groovy was used very lightly and all the examples were very eloquent.On the missing thing part, I did not see any examples on how to execute geospatial searched event though it was mentioned that these are possible, and I was highly interested in it.It does not reduce my score even a bit though, this is an example of a very hard work on the part of the authors and publisher, five our of five.
Amazon Verified review Amazon
Matteo May 09, 2015
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This book provide a quick start into elasticsearch, focusing on a intermediate-to-expert audience. Readers that have zero knowledge on the subject might find themselves a bit lost(the topic is far from simple after all) but doing some exercises using other sources can provide you a decent starting point to understand the book topics(the books requires a working farm to run the samples, making it happen is up to you). The book writing is simple but the topic is not going to help so don't worry if you feel a bit lost at times. If you need to set up a decent search engine for metadata analysis on a shoestring budget this book will give you extra insight on how to create and manage complex queries.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

How do I buy and download an eBook? Chevron down icon Chevron up icon

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website? Chevron down icon Chevron up icon

If you want to purchase a video course, eBook or Bundle (Print+eBook) please follow below steps:

  1. Register on our website using your email address and the password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Cart, or PayPal)
Where can I access support around an eBook? Chevron down icon Chevron up icon
  • If you experience a problem with using or installing Adobe Reader, the contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us
What eBook formats do Packt support? Chevron down icon Chevron up icon

Our eBooks are currently available in a variety of formats such as PDF and ePubs. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks? Chevron down icon Chevron up icon
  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower price than print
  • They save resources and space
What is an eBook? Chevron down icon Chevron up icon

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply login to your account and click on the link in Your Download Area. We recommend you saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.