You're reading from Mastering Apache Solr 7.x An expert guide to advancing, optimizing, and scaling your enterprise search

Product type Paperback

Published in Feb 2018

Publisher Packt

ISBN-13 9781788837385

Length 308 pages

Edition 1st Edition

Languages

Java

Tools

Solr

Concepts

Enterprise Search

Authors (3):

Dharmesh Vasoya

Chintan Mehta

Sandeep Nair


Why choose Solr?

If we already have a relational database, why should we use Solr? The answer is simple: if a use case requires search, you need a search engine platform such as Solr. We will discuss several such use cases later in the chapter.

Databases and Solr each have their own pros and cons. A database queried with SQL supports only limited wildcard-based text search with some basic normalization, such as matching uppercase to lowercase. Such a query can be costly because it may require a full table scan. Solr, in contrast, stores searchable words in an inverted index, which is much faster to search than a traditional database.
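The difference between scanning every row and looking a word up in an index can be sketched in a few lines of Python. This is a toy illustration only, not how Solr or Lucene is implemented; the sample documents and the function names (tokenize, build_inverted_index, scan_search, index_search) are assumptions made up for this example.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into words, dropping edge punctuation."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def scan_search(docs, word):
    """Database-style search: inspect every document on every query."""
    word = word.lower()
    return {doc_id for doc_id, text in docs.items() if word in tokenize(text)}


def index_search(index, word):
    """Index-style search: a single dictionary lookup per query term."""
    return set(index.get(word.lower(), set()))


# Hypothetical sample data, invented for this sketch.
docs = {
    1: "Solr is an enterprise search platform",
    2: "A relational database stores rows in tables",
    3: "Enterprise Search needs fast lookups",
}
index = build_inverted_index(docs)
```

Both functions return the same document ids for a given word. The scan repeats its work for every query, while the index pays the cost once, at indexing time, and then answers each query with a lookup.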

Let's look at the following diagram to understand this better:

An enterprise search solution is a must for an organization nowadays, because it plays a prominent role in getting information quickly. Not having such a platform can result in insufficient information, lost productivity, and additional effort due to duplicated work, simply because the right information is not available quickly. Working without search is something that we can't even think of. Most such use cases share the following key requirements:

Data collected should be parsed and indexed, so parsing and indexing are important requirements of any enterprise search engine platform.
A search should return the required results from the required datasets in near real time. Performance and relevance are two more key requirements.
The search engine platform should be able to crawl or collect all of the data that it requires to perform the search.
Integration of the search engine, along with administration, monitoring, log management, and customization, is also expected.

Solr is designed to provide powerful and flexible search for applications; whenever you want to serve data based on search patterns, Solr is the right fit.

Here is a high-level diagram that shows how Solr is integrated with an application:

The majority of popular websites, including many intranet websites, have integrated search solutions to help users find relevant information quickly. User experience is a key element of any solution that we develop, and search is one of the major features that cannot be ignored when we talk about user experience.

Benefits of keyword search

One of the basic needs a search engine should support is keyword search, as that is the primary goal of a search engine platform; in fact, it is the first thing a user will start with. Keyword search is the most common technique, both for search engines and for end users on our websites. Users now expect to punch in a few keywords and quickly retrieve relevant results. What happens in the backend, however, is something we need to take care of so that the user experience doesn't deteriorate. Let's look at a few areas that we must consider in order to provide better outcomes from a search engine platform using Solr:

Relevant search with quick turnaround
Auto-correct spelling
Auto-suggestions
Synonyms
Multilingual support
Phrase handling: an option to search for a specific keyword, or for all keywords, in the phrase provided
Expanded results if the user wants to view something beyond the top-ranked results

Solr can manage these features easily, so our next challenge is to provide relevant results with an improved user experience.
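As a rough sketch of how a few of these areas map onto a request, the following Python builds a query URL for Solr's /select handler. It only constructs the URL and contacts no server. The host, the collection name products, and the helper build_query_url are assumptions for illustration. The sketch also assumes that the handler defines a default search field and, for the spellcheck flag to have any effect, that a spellcheck component is configured on it.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint: a local Solr with a collection named "products".
SOLR_SELECT = "http://localhost:8983/solr/products/select"


def build_query_url(keywords, match="any", start=0, rows=10, spellcheck=False):
    """Build a /select URL for a keyword search.

    match="any"    -> documents containing any keyword (q.op=OR)
    match="all"    -> documents containing all keywords (q.op=AND)
    match="phrase" -> documents containing the exact phrase (quoted q)
    start and rows page through results beyond the first page.
    """
    if match not in ("any", "all", "phrase"):
        raise ValueError("match must be 'any', 'all' or 'phrase'")
    text = " ".join(keywords)
    params = {
        "q": f'"{text}"' if match == "phrase" else text,
        "q.op": "AND" if match == "all" else "OR",
        "start": start,
        "rows": rows,
    }
    if spellcheck:
        # Only useful if the handler has a spellcheck component configured.
        params["spellcheck"] = "true"
    return SOLR_SELECT + "?" + urlencode(params)


phrase_url = build_query_url(["enterprise", "search"], match="phrase")
all_url = build_query_url(["enterprise", "search"], match="all", spellcheck=True)
second_page_url = build_query_url(["solr"], start=10)
```

Synonyms, auto-suggestions, and multilingual handling are typically set up in the schema and handler configuration rather than in the request, so they are left out of this sketch.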

Benefits of ranked results

Solr is not limited to finding relevant results for a user's search. Providing the end user with a sorted selection of the most relevant results is important as well. In SQL, we would do this by finding the rows that match a pattern and sorting them on one or more columns in either ascending or descending order. Similarly, Solr sorts the result set retrieved for a search pattern, using a score that reflects how strongly each result matches.

Ranked results are very important, primarily because the volume of data that search engine platforms have to dig through is huge. Without control over ranking, the result set would be filled with irrelevant entries and would contain so much data that it wouldn't be feasible to display it. The other important aspect is user experience. All of us now expect a search engine to provide relevant results from a limited number of keywords. We are getting restless, aren't we? Yet we expect a search engine platform not to get annoyed and to give us relevant, ranked results from just a few keywords. Hold on, we are not talking about Google search here! For users like us, Solr can help address such situations by ranking results higher based on various criteria: fields, terms, document name, and a few more. The ranking of a dataset can vary based on many factors, but a higher ranking is generally based on relevancy to the search pattern. We can also apply criteria such as gender, so that certain documents are ranked at the top.
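To make the idea of field-based ranking concrete, here is a toy Python sketch that scores documents by counting query terms in each field and multiplying by a per-field weight. It is not Solr's scoring formula; the documents, the boost values, and the function names score and rank are assumptions invented for this example.

```python
def score(doc, terms, boosts):
    """Sum boost * term count over every boosted field of the document."""
    total = 0.0
    for field, boost in boosts.items():
        words = doc.get(field, "").lower().split()
        for term in terms:
            total += boost * words.count(term.lower())
    return total


def rank(docs, query, boosts, rows=10):
    """Return up to `rows` (score, id) pairs, highest score first.

    Documents that match no term are dropped; ties are broken by id.
    """
    terms = query.split()
    scored = [(score(doc, terms, boosts), doc["id"]) for doc in docs]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return matches[:rows]


# Hypothetical sample data: a title match counts three times a body match.
docs = [
    {"id": "a", "title": "Solr ranking guide", "body": "how ranking works"},
    {"id": "b", "title": "Database tuning", "body": "ranking rows then ranking again"},
    {"id": "c", "title": "Cooking basics", "body": "no search terms here"},
]
boosts = {"title": 3.0, "body": 1.0}
results = rank(docs, "ranking", boosts)
```

With these weights, document a ranks above document b even though b mentions the term more often, because a matches in the higher-weighted title field. Document c does not match and is left out of the results.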
