Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
PostgreSQL Administration Cookbook, 9.5/9.6 Edition

You're reading from   PostgreSQL Administration Cookbook, 9.5/9.6 Edition Effective database management for administrators

Arrow left icon
Product type Paperback
Published in Apr 2017
Publisher
ISBN-13 9781785883187
Length 556 pages
Edition 3rd Edition
Languages
Arrow right icon
Authors (4):
Arrow left icon
Gianni Ciolli Gianni Ciolli
Author Profile Icon Gianni Ciolli
Gianni Ciolli
Hannu Krosing Hannu Krosing
Author Profile Icon Hannu Krosing
Hannu Krosing
Gabriele Bartolini Gabriele Bartolini
Author Profile Icon Gabriele Bartolini
Gabriele Bartolini
Simon Riggs Simon Riggs
Author Profile Icon Simon Riggs
Simon Riggs
Arrow right icon
View More author details
Toc

Table of Contents (13) Chapters Close

Preface 1. First Steps 2. Exploring the Database FREE CHAPTER 3. Configuration 4. Server Control 5. Tables and Data 6. Security 7. Database Administration 8. Monitoring and Diagnosis 9. Regular Maintenance 10. Performance and Concurrency 11. Backup and Recovery 12. Replication and Upgrades

Introduction

PostgreSQL is a feature-rich, general-purpose database management system. It's a complex piece of software, but every journey begins with the first step.

We'll start with your first connection. Many people fall at the first hurdle, so we'll try not to skip that too swiftly. We'll quickly move on to enabling remote users, and from there, we will move on to access through GUI administration tools.

We will also introduce the psql query tool, which is the tool used to load our sample database, as well as many other examples in the book.

For additional help, we've included a few useful recipes that you may need for reference.

Introducing PostgreSQL 9.6

PostgreSQL is an advanced SQL database server, available on a wide range of platforms. One of the clearest benefits of PostgreSQL is that it is open source, meaning that you have a very permissive license to install, use, and distribute PostgreSQL without paying anyone any fees or royalties. On top of that, PostgreSQL is well known as a database that stays up for long periods and requires little or no maintenance in most cases. Overall, PostgreSQL provides a very low total cost of ownership.

PostgreSQL is also noted for its huge range of advanced features, developed over the course of more than 30 years of continuous development and enhancement. Originally developed by the Database Research Group at the University of California, Berkeley, PostgreSQL is now developed and maintained by a huge army of developers and contributors. Many of these contributors have full-time jobs related to PostgreSQL, working as designers, developers, database administrators, and trainers. Some, but not many, of these contributors work for companies that specialize in support for PostgreSQL, like we (the authors) do. No single company owns PostgreSQL, nor are you required (or even encouraged) to register your usage.

PostgreSQL has the following main features:

  • Excellent SQL standards compliance up to SQL: 2011
  • Client-server architecture
  • Highly concurrent design where readers and writers
    don't block each other
  • Highly configurable and extensible for many types of applications
  • Excellent scalability and performance with extensive tuning features
  • Support for many kinds of data models such as relational, post-relational (arrays, nested relations via record types) document (JSON and XML), and key/value

What makes PostgreSQL different?

The PostgreSQL project focuses on the following objectives:

  • Robust, high-quality software with maintainable, well-commented code
  • Low maintenance administration for both embedded and enterprise use
  • Standards-compliant SQL, interoperability, and compatibility
  • Performance, security, and high availability

What surprises many people is that PostgreSQL's feature set is more comparable with Oracle or SQL Server than it is with MySQL. The only connection between MySQL and PostgreSQL is that these two projects are open source; apart from that, the features and philosophies are almost totally different.

One of the key features of Oracle, since Oracle 7, has been snapshot isolation, where readers don't block writers and writers don't block readers. You may be surprised to learn that PostgreSQL was the first database to be designed with this feature, and it offers a complete implementation. In PostgreSQL, this feature is called Multiversion Concurrency Control (MVCC), and we will discuss this in more detail later in the book.

PostgreSQL is a general-purpose database management system. You define the database that you would like to manage with it. PostgreSQL offers you many ways to work. You can either use a normalized database model, augmented with features such as arrays and record subtypes, or use a fully dynamic schema with the help of JSONB and an extension named hstore. PostgreSQL also allows you to create your own server-side functions in any of the dozen different languages.

PostgreSQL is highly extensible, so you can add your own data types, operators, index types, and functional languages. You can even override different parts of the system using plugins to alter the execution of commands or add a new optimizer.

All of these features offer a huge range of implementation options to software architects. There are many ways out of trouble when building applications and maintaining them over long periods of time. Regrettably, we simply don't have space in this book for all the cool features for developers; this book is about administration, maintenance, and backup.

In the early days, when PostgreSQL was still a research database, the focus was solely on the cool new features. Over the last 20 years, enormous amounts of code have been rewritten and improved, giving us one of the most stable and largest software servers available for operational use.

You may have read that PostgreSQL was, or is, slower than My Favorite DBMS, whichever that is. It's been a personal mission of mine over the last ten years to improve server performance, and the team has been successful in making the server highly performant and very scalable. That gives PostgreSQL enormous headroom for growth.

Who is using PostgreSQL? Prominent users include Apple, BASF, Genentech, Heroku, IMDB.com, Skype, McAfee, NTT, The UK Met Office, and The U. S. National Weather Service. Five years ago, PostgreSQL received well in excess of 1 million downloads per year, according to data submitted to the European Commission, which concluded, PostgreSQL is considered by many database users to be a credible alternative.

We need to mention one last thing. When PostgreSQL was first developed, it was named Postgres, and therefore many aspects of the project still refer to the word postgres, for example, the default database is named postgres, and the software is frequently installed using the postgres user ID. As a result, people shorten the name PostgreSQL to simply Postgres, and in many cases use the two names interchangeably.

PostgreSQL is pronounced as post-grez-q-l. Postgres is pronounced as post-grez.

Some people get confused and refer to it as Postgre, which is hard to say and likely to confuse people. Two names are enough, so don't use a third name!

The following sections explain the key areas in more detail.

Robustness

PostgreSQL is robust, high-quality software, supported by automated testing for both features and concurrency. By default, the database provides strong disk-write guarantees, and the developers take the risk of data loss very seriously in everything they do. Options to trade robustness for performance exist, though they are not enabled by default.

All actions on the database are performed within transactions, protected by a transaction log that will perform automatic crash recovery in case of software failure.

Databases may be optionally created with data block checksums to help diagnose hardware faults. Multiple backup mechanisms exist, with full and detailed Point-in-time recovery (PITR), in case of the need for detailed recovery. A variety of diagnostic tools are available as well.

Database replication is supported natively. Synchronous Replication can provide greater than 5 nines (99.999 percent) availability and data protection, if properly configured and managed.

Security

Access to PostgreSQL is controllable via host-based access rules. Authentication is flexible and pluggable, allowing easy integration with any external security architecture.

Full SSL-encrypted access is supported natively. A full-featured cryptographic function library is available for database users.

PostgreSQL provides role-based access privileges to access data, by command type. PostgreSQL also provides Row Level Security for privacy, medical and military grade security

Functions may execute with the permissions of the definer, while views may be defined with security barriers to ensure that security is enforced ahead of other processing.

All aspects of PostgreSQL are assessed by an active security team, while known exploits are categorized and reported at http://www.postgresql.org/support/security/.

Ease of use

Clear, full, and accurate documentation exists as a result of a development process where doc changes are required. Hundreds of small changes occur with each release that smooth off any rough edges of usage, supplied directly by knowledgeable users.

PostgreSQL works in the same way on small and large systems and across operating systems.

Client access and drivers exist for every language and environment, so there is no restriction on what type of development environment is chosen now, or in the future.

SQL standard is followed very closely; there is no weird behavior, such as silent truncation of data.

Text data is supported via a single data type that allows storage of anything from 1 byte to 1 gigabyte. This storage is optimized in multiple ways, so 1 byte is stored efficiently, and much larger values are automatically managed and compressed.

PostgreSQL has a clear policy to minimize the number of configuration parameters, and with each release, we work out ways to auto-tune the settings.

Extensibility

PostgreSQL is designed to be highly extensible. Database extensions can be loaded easily using CREATE EXTENSION, which automates version checks, dependencies, and other aspects of configuration.

PostgreSQL supports user-defined data types, operators, indexes, functions, and languages.

Many extensions are available for PostgreSQL, including the PostGIS extension that provides world-class Geographical Information System (GIS) features.

Performance and concurrency

PostgreSQL 9.6 can achieve more than 1 million reads per second on a 4 socket server, and it benchmarks at more than 30,000 write transactions per second with full durability.

PostgreSQL has an advanced optimizer that considers a variety of join types, utilizing user data statistics to guide its choices. PostgreSQL provides the widest range of index types of any commonly available database server, fully supporting all data types.

PostgreSQL provides MVCC, which enables readers and writers to avoid blocking each other.

Taken together, the performance features of PostgreSQL allow a mixed workload of transactional systems and complex search and analytical tasks. This is important because it means we don't always need to unload our data from production systems and reload them into analytical data stores just to execute a few ad hoc queries. PostgreSQL's capabilities make it the database of choice for new systems, as well as the right long-term choice in almost every case.

Scalability

PostgreSQL 9.6 scales well on a single node up to 4 CPU sockets. PostgreSQL scales well up to hundreds of active sessions, and up to thousands of connected sessions when using a session pool. Further scalability is achieved in each annual release.

PostgreSQL provides multi-node read scalability using the Hot Standby feature. Multi-node write scalability is under active development. The starting point for this is Bi-Directional Replication (discussed in Chapter 12, Replication and Upgrades).

SQL and NoSQL

PostgreSQL follows the SQL Standard very closely. SQL itself does not force any particular type of model to be used, so PostgreSQL can easily be used for many types of models at the same time, in the same database.

With PostgreSQL acting as a relational database, we can utilize any level of denormalization, from the full Third Normal Form, to the more normalized Star Schema models. PostgreSQL extends the relational model to provide arrays, row types, and range types.

A document-centric database is also possible using PostgreSQL's text, XML, and binary JSON (JSONB) data types, supported by indexes optimized for documents and by full text search capabilities.

Key/value stores are supported using the hstore extension.

Popularity

When MySQL was taken over by a commercial database vendor some years back, it was agreed in the EU monopoly investigation that followed that PostgreSQL was a viable competitor. That's certainly been true, with the PostgreSQL user base expanding consistently for more than a decade.

Various polls have indicated that PostgreSQL is the favorite database for building new, enterprise-class applications. The PostgreSQL feature set attracts serious users who have serious applications. Financial services companies may be PostgreSQL's largest user group, though governments, telecommunication companies, and many other segments are strong users as well. This popularity extends across the world; Japan, Ecuador, Argentina, and Russia have very large user groups, and so do USA, Europe, and Australasia.

Amazon Web Services' chief technology officer Dr. Werner Vogels described PostgreSQL as an amazing database, going on to say that PostgreSQL has become the preferred open source relational database for many enterprise developers and start-ups, powering leading geospatial and mobile applications. AWS have more recently revealed that PostgreSQL is their fastest growing service.

Commercial support

Many people have commented that strong commercial support is what enterprises need before they can invest in open source technology. Strong support is available worldwide from a number of companies.

The authors work for 2ndQuadrant, which provides commercial support for open source PostgreSQL, offering 24 x 7 support in English and Spanish with bug-fix resolution times.

Many other companies provide strong and knowledgeable support to specific geographic regions, vertical markets, and specialized technology stacks.

PostgreSQL is also available as hosted or cloud solutions from a variety of companies, since it runs very well in cloud environments.

A full list of companies is kept up to date at http://www.postgresql.org/support/professional_support/.

Research and development funding

PostgreSQL was originally developed as a research project at the University of California, Berkeley in the late 1980s and early 1990s. Further work was carried out by volunteers until the late 1990s. Then, the first professional developer became involved. Over time, more and more companies and research groups became involved, supporting many professional contributors. Further funding for research and development was provided by the NSF. The project also received funding from the EU FP7 Programme in the form of the 4CaaST project for cloud computing and the AXLE project for scalable data analytics. AXLE deserves a special mention because it was a 3-year project aimed at enhancing PostgreSQL's business intelligence capabilities, specifically for very large databases. The project covered security, privacy, integration with data mining, and visualization tools and interfaces for new hardware.
Further details of it are available at http://www.axleproject.eu.
Other funding for PostgreSQL development comes from users who directly sponsor features and companies selling products and services based around PostgreSQL.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image