You're reading from Kibana 7 Quick Start Guide Visualize your Elasticsearch data with ease

Product type Paperback

Published in Jan 2019

Publisher Packt

ISBN-13 9781789804034

Length 172 pages

Edition 1st Edition

Tools

Elasticsearch

Concepts

Data Visualization

Author (1):

Anurag Srivastava


Elastic Stack

Kibana with Elastic Stack can be used to fetch data from different sources and filter, process, and analyze it to create meaningful dashboards. Elastic Stack has the following components:

Elasticsearch: We can store data in Elasticsearch.
Logstash: A data pipeline that we can use to read data from various sources and write it to various destinations. It can also filter the input data before sending it to the output.
Kibana: A graphical user interface that we can use to do a lot of things, which I will cover in this chapter.
Beats: Lightweight data shippers that sit on different servers and send data to Elasticsearch directly or via Logstash:
- Filebeat
- Metricbeat
- Packetbeat
- Auditbeat
- Winlogbeat
- Heartbeat

The following diagram shows how Elastic Stack works:

In the preceding diagram, we have three different servers on which we have installed and configured Beats. These Beats are shipping data to Elasticsearch directly or via Logstash. Once this data is pushed into Elasticsearch, we can analyze, visualize, and monitor the data in Kibana. Let's discuss these components in detail; we're going to start with Elasticsearch.

Elasticsearch

Elasticsearch is a full-text search engine that's primarily used for searching. It can also be used as a NoSQL database and analytics engine. Elasticsearch is basically schema-less and works in near-real-time. It has a RESTful interface, which helps us to interact with it easily from multiple interfaces. Elasticsearch supports different ways of importing various types of structured or unstructured data; it handles all types of data because of its schema-less behavior, and it's quite easy to scale. We have different clients available in Elasticsearch for the following languages:

Java
PHP
Perl
Python
.NET
Ruby
JavaScript
Groovy
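
Because the interface is RESTful, any HTTP client can talk to Elasticsearch, not only the clients listed above. The following is a minimal sketch that prepares (but does not send) a request to index a single document using Python's standard library. The base URL, the index name books, and the document fields are assumptions made for illustration:

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_id, document):
    """Prepare a PUT request that would index one JSON document."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical index name and document, for illustration only.
request = build_index_request(
    "http://localhost:9200", "books", 1, {"title": "Kibana 7 Quick Start Guide"}
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen would send it to a running cluster.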

Its query API is quite robust and we can execute different types of queries, such as boosting some fields over other fields, writing fuzzy queries, searching on single or multiple fields, or applying a Boolean search or wildcard search. Aggregation is another important feature of Elasticsearch, which helps us to aggregate different types of data; it has multiple types of aggregations, such as metric aggregations, bucket aggregations, and the terms aggregation.
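
To make these query types more concrete, here is a minimal sketch of a search request body that combines a Boolean query, field boosting, fuzziness, and two aggregations. It only builds the JSON body; the field names (title, description, language, publisher, pages) are hypothetical and would need to match your own index:

```python
import json


def build_search_body(text):
    """Build a search body: boosted, fuzzy multi-field match plus aggregations."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            # "title^3" boosts matches in title over description.
                            "fields": ["title^3", "description"],
                            # Tolerate small spelling mistakes.
                            "fuzziness": "AUTO",
                        }
                    }
                ],
                # Exact-value filter; does not affect scoring.
                "filter": [{"term": {"language": "en"}}],
            }
        },
        "aggs": {
            # Bucket aggregation: one bucket per publisher.
            "books_per_publisher": {"terms": {"field": "publisher"}},
            # Metric aggregation: a single computed number.
            "average_pages": {"avg": {"field": "pages"}},
        },
    }


print(json.dumps(build_search_body("kibana dashbord"), indent=2))
```

A body like this would be sent to the _search endpoint of an index.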

In fuzzy queries, we match words even if there's no exact match for the spelling. For example, if we search for a misspelled word, fuzzy search can still return the correct result.
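
The idea behind fuzzy matching can be sketched with an edit distance, which counts how many single-character changes separate two words. The following is a simplified illustration using the plain Levenshtein distance; it is not Elasticsearch's actual implementation, and the function names and the max_edits default are assumptions:

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


def fuzzy_match(term, vocabulary, max_edits=2):
    """Return the words that are within max_edits changes of term."""
    return [word for word in vocabulary if edit_distance(term, word) <= max_edits]


print(fuzzy_match("kibna", ["kibana", "logstash", "beats"]))
```

Here the misspelled kibna is one insertion away from kibana, so it still matches.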

The architecture of Elasticsearch has the following components:

Cluster: A collection of one or more nodes that work together is known as a cluster. By default, the cluster name is elasticsearch, which we can change to any unique name.
Node: A node represents a single Elasticsearch server, which is identified by a universally unique identifier (UUID).
Index: A collection of documents where each document in the collection has a common attribute.
Type: A logical partition of the index to store more than one type of document. Type was supported in previous versions and is deprecated from 6.0.0 onward.
Document: A single record in Elasticsearch is known as a document.
Shard: We can subdivide the Elasticsearch index into multiple pieces, which are called shards. When creating an index, we can provide the number of shards required.
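
As a rough sketch of how shards come into play, the following builds the settings body that could be used when creating an index, and shows the general idea of spreading documents across shards by hashing a routing key. The CRC32 hash here is only a stand-in chosen for illustration; Elasticsearch uses its own hash function and routing logic, and the function names are hypothetical:

```python
import zlib


def create_index_body(number_of_shards):
    """Settings body for creating an index with a fixed number of primary shards."""
    return {"settings": {"index": {"number_of_shards": number_of_shards}}}


def pick_shard(routing_key, number_of_shards):
    """Simplified routing: hash the key, then take it modulo the shard count.

    Assumption: CRC32 stands in for Elasticsearch's real hash function.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % number_of_shards


body = create_index_body(3)
for doc_id in ["doc-1", "doc-2", "doc-3", "doc-4"]:
    print(doc_id, "-> shard", pick_shard(doc_id, 3))
```

The same key always lands on the same shard, which is how a document can be found again without searching every shard.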

Elasticsearch is primarily used to store and search data in the Elastic Stack; Kibana picks up this data from Elasticsearch and analyzes or visualizes it in the form of charts and more, which can be combined to create dashboards.

Logstash

Logstash is a data pipeline that can take data input from various sources, filter it, and output it to various destinations; these sources and destinations can be files, Kafka, or databases. Logstash is a very important tool in Elastic Stack as it's primarily used to pull data from various sources and push it to Elasticsearch; from there, Kibana can use that data for analysis or visualization. We can take any type of data using Logstash, whether structured or unstructured, from various sources, such as the internet. The data can be transformed using Logstash's filter option, which has different plugins to work with different sets of data. For example, if we get an IP address in our data, the GeoIP plugin can look up the geolocation for that IP address and add it to the output, where it can then be used in Kibana to plot a map.

The following expression shows us an example of a Logstash configuration file:

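input {
    # Read events from the Apache access log
    file {
        path => "/var/log/apache2/access.log"
    }
}
filter {
    # Parse each log line into named fields using the combined Apache log pattern
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
}
output {
    # Send the parsed events to the local Elasticsearch cluster
    elasticsearch {
        hosts => ["localhost"]
    }
}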

In the preceding expression, we have three sections: input, filter, and output. In the input section, we're reading the Apache access log file data. The filter section is there to extract Apache access log data in different fields, using the grok filter option. The output section is quite straightforward as it's pushing the data to the local Elasticsearch cluster. We can configure the input and output sections to read or write from or to different sources, whereas we can apply different plugins to transform the input data; for example, we can mutate a field, transform a field value, or add geolocation from an IP address using the filter option.
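
The input, filter, and output flow can be sketched in a few lines of Python. This is only a conceptual model, not how Logstash is implemented: the line format, the GEO_TABLE lookup (standing in for a GeoIP database), and all function names are hypothetical, and a plain list stands in for Elasticsearch:

```python
# Hypothetical lookup table standing in for a GeoIP database.
GEO_TABLE = {"203.0.113.7": {"country_name": "Exampleland"}}


def read_input(lines):
    """Input stage: wrap each raw line in an event."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def apply_filter(event):
    """Filter stage: split the message into fields and enrich it with geolocation."""
    clientip, verb, request = event["message"].split(" ", 2)
    event.update({"clientip": clientip, "verb": verb, "request": request})
    if clientip in GEO_TABLE:
        event["geoip"] = GEO_TABLE[clientip]
    return event


def run_pipeline(lines, sink):
    """Output stage: append each processed event to the sink."""
    for event in read_input(lines):
        sink.append(apply_filter(event))


sink = []
run_pipeline(["203.0.113.7 GET /index.html\n"], sink)
print(sink[0])
```

Each stage only depends on the event passed to it, which is why inputs, filters, and outputs can be swapped independently.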

Grok is a tool that we can use to generate structured and queryable data by parsing unstructured data.
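
Conceptually, a grok pattern behaves like a regular expression with named groups. The following sketch parses one combined-format access log line with Python's re module. The regular expression is a simplified approximation written for this example, not the actual COMBINEDAPACHELOG definition, and the field names are assumptions that loosely mirror common grok output:

```python
import re

# Simplified stand-in for a combined Apache log pattern (assumption).
COMBINED_LOG = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_log(line):
    """Turn one unstructured log line into a dict of named fields, or None."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None


sample = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)
fields = parse_access_log(sample)
print(fields["clientip"], fields["verb"], fields["request"], fields["response"])
```

Once the line is split into fields like these, each one can be searched, filtered, and aggregated separately.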

Kibana

In Elastic Stack, Kibana is mainly used to provide the graphical user interface, which we use to do multiple things. When Kibana was first released, we just used it to create charts and histograms, but with each update, Kibana evolves and now we have lots of killer features that make Kibana stand out from the crowd. There are many features in Kibana, but when we talk about the key features, they are as follows:

Discover your data by exploring it
Analyze your data by applying different metrics
Visualize your data by creating different types of charts
Apply machine learning to your data to detect anomalies and forecast future trends
Monitor your application using APM
Manage users and roles
A console to run Elasticsearch expressions
Play with time-series data using Timelion
Monitor your Elastic Stack using Monitoring

Application Performance Monitoring (APM) is built on top of the Elastic Stack, and we use it to monitor applications and software services in real time. We'll look at APM in more detail in Chapter 6, Monitoring Applications with APM.

In this way, there are different use cases that can be handled well using Kibana. I'm going to explain each of them in later chapters.

Beats

Beats are single-purpose, lightweight data shippers that we use to get data from different servers. Beats can be installed on the servers as a lightweight agent to send system metrics, or process or file data to Logstash or Elasticsearch. They gather data from the machine on which they are installed and then send that data to Logstash, which we use to parse or transform the data before sending it to Elasticsearch, or we can send the Beats data directly into Elasticsearch.

They are quite handy as it takes almost no time to install and configure Beats to start sending data from the server on which they're installed. Each Beat is written to target a specific requirement and works really well for that use case. Filebeat, for example, works with files such as Apache log files: it keeps a watch on the files, and as soon as an update happens, the updated data is shipped to Logstash or Elasticsearch. This file operation can also be configured using Logstash, but that may require some tuning; Filebeat is very easy to configure in comparison to Logstash.

Another advantage is that they have a smaller footprint and they sit on the servers from where we want the monitoring data to be sent. This makes the system quite simple because the collection of data happens on the remote machine, and then this data is sent to a centralized Elasticsearch cluster directly, or via Logstash. One more feature that makes Beats an important component of the Elastic Stack is the built-in Dashboard, which can be created in no time. We have a simple configuration in Beats to create a monitoring Dashboard in Kibana, which can be used to monitor directly or we might have to do some minor changes to use it for monitoring. There are different types of Beats, which we'll discuss here.

Filebeat

Filebeat is a lightweight data shipper that forwards log data from different servers to a central place, where we can analyze that log data. Filebeat monitors the log files that we specify, collects the data from there in an incremental way, and then forwards them to Logstash, or directly into Elasticsearch for indexing.

Once Filebeat is configured, it starts its inputs as per the given instructions. For each log file, Filebeat starts a separate harvester, which reads that single file and picks up the incremental data. The harvester sends the log data to libbeat, and then libbeat aggregates all events and sends the data to the configured output, such as Elasticsearch, Kafka, or Logstash.
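
The incremental reading done by a harvester can be sketched by remembering a byte offset per file and reading only what was appended since the last pass. This is a simplified model under stated assumptions, not Filebeat's implementation; the Harvester class and its method name are hypothetical:

```python
class Harvester:
    """Reads only the complete lines appended to one file since the last call."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte position of the first unread line

    def read_new_lines(self):
        with open(self.path, "rb") as handle:
            handle.seek(self.offset)
            data = handle.read()
        # Consume up to the last newline; a partial line waits for the next pass.
        end = data.rfind(b"\n") + 1
        self.offset += end
        return [line.decode("utf-8") for line in data[:end].splitlines()]
```

Calling read_new_lines repeatedly returns each line exactly once, which is the behavior we want from a shipper that follows a growing log file.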

Metricbeat

Metricbeat is another lightweight data shipper that can be installed on any server to fetch system metrics. It helps us to monitor servers by collecting metrics from the systems and services running on the servers where it is installed. Metricbeat ships the collected metrics data to Elasticsearch or Logstash for analysis. Metricbeat can monitor many different services, as follows:

MySQL
PostgreSQL
Apache
Nginx
Redis
HAProxy

I've listed only some of the services; Metricbeat supports many more.
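
To give a feel for what a collected metric looks like, here is a minimal sketch that samples disk usage with Python's standard library and wraps it in a timestamped event. The event layout and field names are assumptions for illustration and do not reproduce Metricbeat's actual document format:

```python
import os
import shutil
import time


def collect_disk_metrics(path):
    """Sample disk usage for the filesystem containing path as one event."""
    usage = shutil.disk_usage(path)
    return {
        "@timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "metricset": "disk",  # hypothetical label for this kind of metric
        "path": path,
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "used_pct": round(usage.used / usage.total, 4) if usage.total else 0.0,
    }


print(collect_disk_metrics(os.getcwd()))
```

A shipper would run a collector like this on a fixed interval and forward each event to Elasticsearch or Logstash.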

Packetbeat

Packetbeat is used to analyze network packets in real time. Packetbeat data can be pushed to Elasticsearch, which we can use to configure Kibana for real-time application monitoring. Packetbeat is very effective in diagnosing network-related issues, since it captures the network traffic between our application servers and it decodes the application-layer protocols, such as HTTP, Redis, and MySQL. Also, it correlates the request and response, and captures important fields.
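
Correlating requests with responses can be sketched as pairing events that belong to the same connection and measuring the time between them. The event shape used here (conn, kind, ts, and so on) is entirely hypothetical and much simpler than real decoded network traffic:

```python
def correlate(events):
    """Pair each response with the pending request on the same connection."""
    pending = {}
    transactions = []
    for event in events:
        key = event["conn"]
        if event["kind"] == "request":
            pending[key] = event
        elif event["kind"] == "response" and key in pending:
            request = pending.pop(key)
            transactions.append({
                "method": request["method"],
                "path": request["path"],
                "status": event["status"],
                "duration_ms": round((event["ts"] - request["ts"]) * 1000),
            })
    return transactions


events = [
    {"conn": "a", "kind": "request", "ts": 10.0, "method": "GET", "path": "/"},
    {"conn": "b", "kind": "request", "ts": 10.5, "method": "POST", "path": "/login"},
    {"conn": "a", "kind": "response", "ts": 10.25, "status": 200},
]
print(correlate(events))
```

The resulting transaction carries fields from both sides, so a slow or failing endpoint shows up as a single record.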

Packetbeat supports the following protocols:

HTTP
MySQL
PostgreSQL
Redis
MongoDB
Memcache
TLS
DNS

Using Packetbeat, we can send our network packet data directly into Elasticsearch or through Logstash. Packetbeat is a handy tool because network packets are otherwise difficult to monitor. Just install and configure it on the server where you want to monitor the network packets, and the packet data starts flowing into Elasticsearch, where we can use it to create a packet data monitoring dashboard. Packetbeat also provides a custom dashboard that we can easily configure using the Packetbeat configuration file.

Auditbeat

Auditbeat can be installed and configured on any server to audit the activities of users and processes. It's a lightweight data shipper that sends the data directly to Elasticsearch or via Logstash. Sometimes it's difficult to track changes in binaries or configuration files; Auditbeat is helpful here because it detects changes to critical files, such as different configuration files and binaries.
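
One common way to detect changes to critical files is to compare cryptographic hashes of their contents against a stored baseline. The following is a minimal sketch of that idea; it is an illustration under stated assumptions rather than Auditbeat's implementation, and the function names are hypothetical:

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(paths):
    """Record the current digest of every watched file."""
    return {path: file_digest(path) for path in paths}


def changed_files(baseline, current):
    """List the files whose digest differs from the baseline."""
    return sorted(path for path in current if baseline.get(path) != current[path])
```

Taking a snapshot of the watched paths on a schedule and passing it to changed_files together with the baseline reports any file that was modified in between.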

We can configure Auditbeat to fetch audit events from the Linux audit framework. The Linux audit framework is an auditing system that collects the information of different events on the system. Auditbeat can help us to take that data and push it to Elasticsearch from where Kibana can be utilized to create dashboards.

Winlogbeat

Winlogbeat is a data shipper that ships the Windows event logs to Logstash or the Elasticsearch cluster. It keeps a watch and reads from different Windows event logs and sends them to Logstash or Elasticsearch in a timely manner. Winlogbeat can send different types of events:

Hardware Events
Security Events
System Events
Application Events

Winlogbeat reads the raw event data and sends it to Logstash or Elasticsearch as structured data, which makes it easy to filter and aggregate.

Heartbeat

Heartbeat is a lightweight shipper that monitors server uptime. It can be installed on a remote server; after that, it periodically checks the status of different services and tells us whether they're available. The major difference between Metricbeat and Heartbeat is that Metricbeat tells us whether a server is up or down, while Heartbeat tells us whether services are reachable. In that respect, it's quite similar to the ping command, which tells us whether the server is responding.
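
A reachability check of this kind can be sketched as a simple TCP connection attempt. This is a minimal illustration, not Heartbeat's implementation; the function name and the one-second default timeout are assumptions:

```python
import socket


def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling is_reachable("localhost", 9200) on a schedule, for example, would report whether a local Elasticsearch node is accepting connections on its default port.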