Mastering Elasticsearch - Second Edition

Further your knowledge of the Elasticsearch server by learning more about its internals, querying, and data handling

Marek Rogozinski
4.3 (9 Ratings)
Paperback Feb 2015 434 pages 2nd Edition


Chapter 2. Power User Query DSL

In the previous chapter, we looked at what Apache Lucene is, how its architecture looks, and how the analysis process is handled. In addition, we saw what the Lucene query language is and how to use it. We also discussed Elasticsearch, its architecture, and its core concepts. In this chapter, we will dive deep into Elasticsearch, focusing on the Query DSL. We will first go through how the Lucene scoring formula works before turning to advanced queries. By the end of this chapter, we will have covered the following topics:

  • How the default Apache Lucene scoring formula works
  • What query rewrite is
  • What query templates are and how to use them
  • How to leverage complicated Boolean queries
  • What the performance implications of large Boolean queries are
  • Which query you should use for your particular use case

Default Apache Lucene scoring explained

A very important part of the querying process in Apache Lucene is scoring. Scoring is the process of calculating the score property of a document in the scope of a given query. What is a score? A score is a factor that describes how well the document matched the query. In this section, we'll look at the default Apache Lucene scoring mechanism, the TF/IDF (term frequency/inverse document frequency) algorithm, and how it affects the returned documents. Knowing how this works is valuable when designing complicated queries and choosing which query parts should be more relevant than others. Knowing the basics of how scoring works in Lucene allows us to tune queries, and the results returned by them, to match our use case more easily.
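To make the idea of TF/IDF more concrete, here is a minimal Python sketch of how such a score could be computed over a tiny in-memory corpus. It is only an approximation of Lucene's classic similarity: it uses the square root of the term frequency, the inverse document frequency 1 + ln(numDocs / (docFreq + 1)), and a length norm of 1 / sqrt(number of terms), and it deliberately leaves out the coordination factor, the query norm, and boosts. The function names and the sample corpus are our own assumptions made for illustration.

```python
import math
from collections import Counter


def tf(freq):
    """Term frequency factor: grows with the square root of the raw count."""
    return math.sqrt(freq)


def idf(doc_freq, num_docs):
    """Inverse document frequency: rarer terms get a higher weight."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def score(query_terms, doc_tokens, corpus):
    """Simplified TF/IDF score of one document for a list of query terms."""
    if not doc_tokens:
        return 0.0
    counts = Counter(doc_tokens)
    # Shorter fields are considered more relevant for the same match.
    length_norm = 1.0 / math.sqrt(len(doc_tokens))
    total = 0.0
    for term in query_terms:
        freq = counts[term]
        if freq == 0:
            continue  # the term does not occur in this document
        doc_freq = sum(1 for other in corpus if term in other)
        total += tf(freq) * idf(doc_freq, len(corpus)) ** 2 * length_norm
    return total


# Hypothetical corpus: each document is a list of already analyzed tokens.
corpus = [
    ["elasticsearch", "server", "guide"],
    ["mastering", "elasticsearch"],
    ["lucene", "in", "action"],
]
scores = [score(["mastering", "elasticsearch"], doc, corpus) for doc in corpus]
for doc, value in zip(corpus, scores):
    print(" ".join(doc), "->", round(value, 3))
```

In this sketch, the second document scores highest because it matches both query terms, one of which is rare in the corpus, and because it is the shortest matching document. The third document matches nothing, so its score is zero.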

When a document is matched

When a document is returned by Lucene, it means that it matched the query we've sent. In such a case, the document is given a score. Sometimes, the score is the same for all the documents...

Query rewrite explained

We have already talked about scoring, which is valuable knowledge, especially when trying to improve the relevance of our queries. We also think that, when debugging your queries, it is valuable to know how they are executed; that is why we decided to include this section on how query rewrite works in Elasticsearch, why it is used, and how to control it.

If you have ever used queries such as the prefix query and the wildcard query (basically any query that is said to be multiterm), you've probably heard about query rewriting. Elasticsearch does that for performance reasons. The rewrite process changes the original, expensive query into a set of queries that are far less expensive from Lucene's point of view, and thus speeds up query execution. The rewrite process is not visible to the client, but it is good to know that we can alter its behavior. For example, let's look at what Elasticsearch...
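The general idea of rewriting a multiterm query can be sketched in a few lines of Python. The following toy example is not how Lucene implements rewriting; it only assumes that the index exposes a sorted list of terms for a field and shows how a prefix could be expanded into a Boolean query made of plain term queries. The function name, the field name, and the term list are hypothetical.

```python
import bisect


def rewrite_prefix(field, prefix, term_dictionary):
    """Expand a prefix into a Boolean 'should' query of term queries."""
    terms = sorted(term_dictionary)
    # Jump straight to the first term that could start with the prefix.
    start = bisect.bisect_left(terms, prefix)
    matching = []
    for term in terms[start:]:
        if not term.startswith(prefix):
            break  # terms are sorted, so no later term can match
        matching.append(term)
    return {"bool": {"should": [{"term": {field: t}} for t in matching]}}


# Hypothetical terms indexed in the "title" field.
title_terms = ["crime", "fright", "front", "frontal", "frontier"]
rewritten = rewrite_prefix("title", "fron", title_terms)
print(rewritten)
```

The more terms a prefix matches, the larger the rewritten query becomes, which is why it is useful to be able to control how the rewrite is done.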

Query templates

When the application grows, it is very likely that the environment will become more and more complicated. In your organization, you probably have developers who specialize in particular layers of the application; for example, you have at least one frontend designer and an engineer responsible for the database layer. It is very convenient to have the development divided into several modules because you can work on different parts of the application in parallel without the need for constant synchronization between individuals and the whole team. Of course, the book you are currently reading is not a book about project management, but about search, so let's stick to that topic. In general, it would be useful, at least sometimes, to be able to extract all queries generated by the application, give them to a search engineer, and let him/her optimize them, in terms of both performance and relevance. In such a case, the application developers would only have to pass...
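The separation of concerns described above can be illustrated with a small Python sketch: the search engineer owns the query text, and the application only supplies parameter values. This is a toy renderer for {{name}} placeholders written for this illustration; it is not the Elasticsearch template API, and the template, the parameter name, and the render function are all assumptions.

```python
import json
import re

# Query text maintained by the search engineer (hypothetical template).
TEMPLATE = '{"query": {"term": {"title": "{{phrase}}"}}}'


def render(template, params):
    """Fill {{name}} placeholders with JSON-escaped parameter values."""
    def substitute(match):
        value = str(params[match.group(1)])
        # json.dumps escapes quotes and backslashes; drop the outer quotes
        # because the placeholder already sits inside a JSON string.
        return json.dumps(value)[1:-1]

    return json.loads(re.sub(r"\{\{(\w+)\}\}", substitute, template))


# The application developer only passes the parameters.
query = render(TEMPLATE, {"phrase": "front"})
print(json.dumps(query))
```

With such a split, the query structure can be tuned without touching the application code, as long as the parameter names stay the same.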

Handling filters and why it matters

Let's have a look at the filtering functionality provided by Elasticsearch. At first, it may seem redundant because almost all the filters have a query counterpart in the Elasticsearch Query DSL. But there must be something special about filters, because they are commonly used and are recommended when it comes to query performance. This section will discuss why filtering is important, how filters work, and what types of filtering Elasticsearch exposes.

Filters and query relevance

The first difference when comparing queries to filters is the influence on the document score. Let's compare queries and filters to see what to expect. We will start with the following query:

curl -XGET "http://127.0.0.1:9200/library/_search?pretty" -d'
{
    "query": {
        "term": {
           "title": {
              "value": "front"
           }
        }
   ...
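The difference in score handling between a query and a filter can be modeled with a short Python sketch. It is a toy in-memory model, not Elasticsearch code: the query returns a score for each matching document (here simply the term count), while the filter only answers whether a document matches. The documents and both function names are hypothetical.

```python
def term_query(docs, field, value):
    """Return {doc_id: score}; the score depends on how often the term occurs."""
    hits = {}
    for doc_id, doc in docs.items():
        freq = doc[field].split().count(value)
        if freq:
            hits[doc_id] = float(freq)
    return hits


def term_filter(docs, field, value):
    """Return the set of matching doc IDs; no score is calculated."""
    return {doc_id for doc_id, doc in docs.items()
            if value in doc[field].split()}


# Hypothetical documents in a "library" index.
library = {
    1: {"title": "front front line"},
    2: {"title": "front page"},
    3: {"title": "back page"},
}
print(term_query(library, "title", "front"))   # matching documents with scores
print(term_filter(library, "title", "front"))  # matching documents only
```

Both calls select the same two documents, but only the query distinguishes between them by score.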

Choosing the right query for the job

In our Elasticsearch Server Second Edition, we described the full query language, the so-called Query DSL, provided by Elasticsearch: a JSON-structured query language that allows us to build virtually any query we can imagine. What we didn't talk about is when the queries can be used and when they should be used. For a person who doesn't have much prior experience with a full text search engine, the number of queries exposed by Elasticsearch can be overwhelming and very confusing. Because of that, we decided to extend what we wrote in the second edition of our first Elasticsearch book and show you, the reader, what you can do with Elasticsearch.

We decided to divide the following section into two distinct parts. The first part will try to categorize the queries and tell you what to expect from a query in that category. The second part will show you an example usage of queries from each group and will discuss the differences. Please...


Description

This book is for Elasticsearch users who want to extend their knowledge and develop new skills. Prior knowledge of the Query DSL and data indexing is expected.


Product Details

Publication date: Feb 27, 2015
Length: 434 pages
Edition: 2nd
Language: English
ISBN-13: 9781783553792
Vendor: Elastic


Table of Contents

10 Chapters
1. Introduction to Elasticsearch
2. Power User Query DSL
3. Not Only Full Text Search
4. Improving the User Search Experience
5. The Index Distribution Architecture
6. Low-level Index Control
7. Elasticsearch Administration
8. Improving Performance
9. Developing Elasticsearch Plugins
Index

Customer reviews

4.3 (9 Ratings)
5 star 66.7%
4 star 11.1%
3 star 11.1%
2 star 11.1%
1 star 0%




R. Somerfield Mar 24, 2015
5 out of 5
I've been using ElasticSearch professionally for over 6 months - our first version product using ElasticSearch has shipped this week!

Whilst it is fairly easy to get started with ElasticSearch, there are a lot of fundamental aspects to it (and its underpinnings) which can have a dramatic effect on using it. And, whilst there are some good examples, they tend to be fairly simplistic.

Up until I got this book I'd been (extensively!!) relying on Google. And whilst I've eventually managed to work out the answers, it took a lot of searching and therefore a lot longer than I'd have ideally liked. In addition, finding individual snippets on the web doesn't help with some of the broad knowledge.

I found this book to be an excellent guide to help me understand the underpinnings of ElasticSearch, and it also helped me to make many improvements in my Google-acquired knowledge. I would recommend it to anyone who is spending any amount of time with ElasticSearch.
Amazon Verified review
adnan baloch Apr 27, 2015
5 out of 5
First off, a disclaimer for newbies: This book is meant for intermediate users of the Elasticsearch Server. Still, the book begins with a short but comprehensive introduction to the basic concepts used in document indexing, the various node types in Elasticsearch and the Apache Lucene library that powers Elasticsearch under the hood. The examples in the book are based on the premise that the user is running an online bookstore which is a powerful way to explore the possibilities offered by Elasticsearch. The examples are in JSON document format which should be familiar to any serious developer. The score of a query determines how well the document matches the input query. The scoring formula is explained using a simple example to demonstrate how it works in practice. Query rewrite methods, filters and types of queries are explained in detail with a special focus on their performance impact. Simple use cases let the readers know when to use which query group. A new sandboxed scripting language called Groovy is introduced that enables on-the-fly calculation of document scores without compromising the security of the search server. Lucene expressions are also given a brief touch. Readers will enjoy the chapter on improving search suggestions which can make a real difference in the search experience of users. Plenty of examples in this chapter help to take the guesswork out of improving query relevance. Filtering garbage results and using term faceting to narrow down search results are discussed to give readers the power to tailor their websites according to their needs for maximum user satisfaction. Scaling to accommodate increasing demands requires the right amount of shards and replicas. Deciding this amount is explored with a practical routing example. The final few chapters deal with low level index control, Elasticsearch administration, performance improving techniques and developing Elasticsearch plugins.
Whether you have a single node or an entire cluster of nodes in the cloud, the sheer amount of information contained in over 400 pages of this book ensures that readers will find this book a worthy companion in their quest to tame and tune Elasticsearch server for blazing fast query speed and highly relevant search results.
Amazon Verified review
Massera Riccardo Apr 14, 2015
5 out of 5
Mastering Elasticsearch is a very well written and comprehensive book that helps professionals already working with Elasticsearch to understand how it works and how to get the most from it.

This book deals with every aspect of running Elasticsearch ranging from mastering queries and indexing, to optimization, to administration, to scaling and performance tuning.

The authors explain every topic in a clear and concise manner, even when they delve into the internals of Elasticsearch, showing the reader how things work under the hood with many examples and pictures.

The first part of the book is introductory and explains the general architecture of Elasticsearch, indexing, query scoring internals and how to obtain a good user experience.

One particular subject that is analysed throughout the book is the relationship of Elasticsearch with the Lucene indexing and search engine, since Elasticsearch is built on top of it.

In the second part, the authors teach the most advanced topics, like sharding and index allocation, low level index control, routing, administration and scaling an Elasticsearch system, giving the reader the keys to correctly design, dimension and evolve an efficient Elasticsearch cluster.

Overall, if you are already working with Elasticsearch and you need to know how to detect and handle performance issues or to improve user experience or to scale your Elasticsearch environment, this is an excellent book to read.
Amazon Verified review
A. Zubarev May 10, 2015
5 out of 5
It is hard to overestimate the importance of search nowadays. It probably even occurs without being noticed, but try to count how many times a day you tried to search for a product online, search through an online catalogue, an abbreviation or simply a weather forecast? Even a simple one page document or a webpage offers search capabilities (Ctrl-F). But have you ever wondered about how fast the search through Wiki is or how exact it is, even correcting your misspellings?

These and more elaborate searches are a product of very powerful software. Typically thanks to the Lucene index, it is like standing on the shoulders of a giant, Solr and Elasticsearch are capable of scouting through a sea of documents and terms in milliseconds, boosting the most relevant results to the top, helping a human or robot deliver business insight, guiding through the darkness of overwhelming amounts of information to the decision or helping buy the correct product.

It becomes very obvious that these products encapsulate tons of advanced features and boast an array of capabilities, but sifting through the myriad of features may at times become exhausting, and surely time-consuming.

This is where excellent technical literature such as Mastering Elasticsearch 2nd Edition makes a lot of sense. Please note, this is the 2nd edition in a very short period of time (less than two years). This means two things. First, the book is very popular so the authors get a lot of support and demand for a sequel; second, the technology is evolving fast (~ 100 pages added). All this is good news and a confirmation that Elasticsearch is a mature yet promising technology that is here to stay.
It is worth stating that this book is seen by the authors as a companion book to the Elasticsearch Server 2nd Edition that I did not read, but the authors stress that it is a good idea to start from that one.

The Mastering Elasticsearch book does feel like it is aimed at search engineers, or those who are already involved in conceiving or using a product that will utilize the search capabilities of Elasticsearch. It is full of practical advice, insight and examples that range from fine-tuning the searches to setting up and properly configuring the cluster. There is a chapter toward the end about how to create plugins to any software project.

I liked the following parts in the book: boosting search scores, using Groovy as a scripting language, troubleshooting and speeding up performance.

Some knowledge of Java is assumed, but no special tooling or software is necessary to go through the book. But please be aware that you will type a lot of text, JSON specifically, so you may want an editor that has good support for JSON, especially color highlighting, e.g. the Eclipse JSON plugin. Groovy was used very lightly and all the examples were very eloquent.

On the missing thing part, I did not see any examples on how to execute geospatial searches even though it was mentioned that these are possible, and I was highly interested in it.

It does not reduce my score even a bit though, this is an example of very hard work on the part of the authors and publisher, five out of five.
Amazon Verified review
Matteo May 09, 2015
5 out of 5
This book provides a quick start into Elasticsearch, focusing on an intermediate-to-expert audience. Readers that have zero knowledge of the subject might find themselves a bit lost (the topic is far from simple after all) but doing some exercises using other sources can provide you a decent starting point to understand the book topics (the book requires a working farm to run the samples, making it happen is up to you). The book writing is simple but the topic is not going to help so don't worry if you feel a bit lost at times. If you need to set up a decent search engine for metadata analysis on a shoestring budget this book will give you extra insight on how to create and manage complex queries.
Amazon Verified review
