We have previously seen that an analyzer may be a single class or a combination of tokenizer and filter classes.
The analyzer executes the analysis process in two steps:
- Tokenization (parsing): Using configured tokenizer classes
- Filtering (transformation): Using configured filter classes
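These two steps map directly onto an analyzer definition in the Solr schema. The following is a minimal sketch; `StandardTokenizerFactory` and `LowerCaseFilterFactory` are standard Solr factory classes, while the field-type name `text_example` is illustrative:

```xml
<fieldType name="text_example" class="solr.TextField">
  <analyzer>
    <!-- Step 1: tokenization (parsing) -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Step 2: filtering (transformation) -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

At analysis time, the tokenizer splits the raw text into tokens, and each filter in turn transforms the token stream it receives from the component above it.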
We can also preprocess a character stream before tokenization with the help of CharFilters (covered later in this chapter).

An analyzer knows which field it is configured for, but a tokenizer has no idea about the field. The job of the tokenizer is only to read from a character stream, apply a tokenization mechanism based on its behavior, and produce a token stream.
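As a sketch of such preprocessing, a CharFilter is declared before the tokenizer in the same analyzer chain. Here the standard Solr class `HTMLStripCharFilterFactory` removes HTML markup from the character stream before the tokenizer ever sees it; the field-type name `text_html` is illustrative:

```xml
<fieldType name="text_html" class="solr.TextField">
  <analyzer>
    <!-- Preprocess the raw character stream before tokenization -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With this configuration, input such as `<p>Hello World</p>` is stripped to `Hello World` before tokenization, so no tokens are produced for the markup itself.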