You're reading from PostgreSQL High Performance Cookbook Mastering query optimization, database monitoring, and performance-tuning for PostgreSQL

Product type Paperback

Published in Mar 2017

Publisher Packt

ISBN-13 9781785284335

Length 360 pages

Edition 1st Edition

Languages

SQL

Tools

PostgreSQL

Concepts

High Performance Programming

Authors (2):

Chitij Chauhan

Dinesh Kumar

View More author details

Table of Contents (13) Chapters

Preface

1. Database Benchmarking FREE CHAPTER

2. Server Configuration and Control

3. Device Optimization

4. Monitoring Server Performance

5. Connection Pooling and Database Partitioning

6. High Availability and Replication

7. Working with Third-Party Replication Management Utilities

8. Database Monitoring and Performance

9. Vacuum Internals

10. Data Migration from Other Databases to PostgreSQL and Upgrading the PostgreSQL Cluster

11. Query Optimization

12. Database Indexing

Discussing RAID levels

In this recipe, we will be discussing about various RAID levels and their unique usage.

Getting ready

In this recipe, we will be discussing several RAID levels, which we configure for database requirements. RAID (Redundant Array of Interdependent Disks) has a dedicated hardware controller to deal with multiple disks, including a separate processor along with a battery backup cache, where data can be flushed to disk properly when a power failure occurs.

How to do it...

RAID levels can be differentiated as per their configurations. RAID supports configuration techniques such as striping, mirroring, and parity to improve the disk storage performance, or high availability. The most popular RAID levels are zero to six, and each level provides its own kind of disk storage capacity, read/write performance and high availability. The common RAID levels we configure for DBMS are 0, 1, 5, 6, or 10 (1 and 0).

How it works...

Let us discuss about how the mostly used RAID level works:

RAID 0

This configuration only focuses on read/write performance by striping the data across multiple devices. With this configuration, we can allocate the complete disk storage for the applications data. The major drawback in this configuration is no high availability. In the case of any single disk failure, it will cause the remaining disks to be useless as they are missing the chunks from the failed disk. This is a not recommended RAID configuration for real-time database systems, but it is a recommended configuration for storing non-critical business data such as historical application logs, database logs, and so on.

RAID 1

This configuration is only to focus on high availability rather than on performance, by broadcasting the data among two disk drives. That is, a single copy of the data will be kept on two disks. If one disk is corrupted, then we can still use the other one for read/write operations. This is also not a recommended configuration for real-time database systems, as it is lacking the write performance. Also, in this configuration, we will be utilizing 50% of the disk to store the actual data, and the rest to keep its duplicated information for high availability. This is a recommended configuration where the durability of data matters when compared with write performance.

RAID 5

This configuration provides more storage and high availability on the disk, by storing the parity blocks across the disks. Unlike RAID 1, it offers more disk space to keep the actual data, as parity blocks are spread among the disks. In any case, if one disk is corrupted, then we can use the parity blocks from the other disk, to fetch the missing data. However, this is also not a recommended configuration, since every read/write operation on the disk needs to process the parity blocks, to get the actual data out of it.

RAID 6

This configuration provides more redundancy than RAID 5 by storing the two parity blocks information for each write operation. That is, if both disks become corrupted, RAID 6 can still get the data from the parity blocks, unlike RAID 5. This configuration is also not recommended for the database systems, as write performance is less as compared than previous RAID levels.

RAID 10

This configuration is the combination of RAID levels 0 and 1. That is, the data will be striped to multiple disks and will be replicated to another disk storage. It is the most recommended RAID level for real-time business applications, where we achieve a better performance than with RAID 1, and higher availability than RAID 0.