Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
SQL for Data Analytics

You're reading from   SQL for Data Analytics Harness the power of SQL to extract insights from data

Arrow left icon
Product type Paperback
Published in Aug 2022
Publisher Packt
ISBN-13 9781801812870
Length 540 pages
Edition 3rd Edition
Languages
Arrow right icon
Authors (4):
Arrow left icon
Benjamin Johnston Benjamin Johnston
Author Profile Icon Benjamin Johnston
Benjamin Johnston
Matt Goldwasser Matt Goldwasser
Author Profile Icon Matt Goldwasser
Matt Goldwasser
Jun Shan Jun Shan
Author Profile Icon Jun Shan
Jun Shan
Upom Malik Upom Malik
Author Profile Icon Upom Malik
Upom Malik
Arrow right icon
View More author details
Toc

Table of Contents (11) Chapters Close

Preface 1. Understanding and Describing Data 2. The Basics of SQL for Analytics FREE CHAPTER 3. SQL for Data Preparation 4. Aggregate Functions for Data Analysis 5. Window Functions for Data Analysis 6. Importing and Exporting Data 7. Analytics Using Complex Data Types 8. Performant SQL 9. Using SQL to Uncover the Truth: A Case Study Appendix

Column Constraints

Column constraints are keywords that help you specify the properties you want to attribute to a particular column. In other words, you can ensure that all the rows in that column adhere to your specified constraint. Some major column constraints are as follows:

  • NOT NULL: This constraint guarantees that no value in a column can be NULL.
  • UNIQUE: This constraint guarantees that every single row for a column has a unique value and that no value is repeated.
  • PRIMARY KEY: This is a special constraint that is unique for each row and helps you to find a specific row more quickly. If the primary key of this table contains only one column, you can add this PRIMARY KEY constraint to the column definition of the primary key column. If the primary key of this table consists of multiple columns, you need to use a table constraint to define the key in the CREATE statement.

Simple CREATE Statement

Now that you know about data types and column constraints, you can start creating your first table. Suppose you want to create a table called state_populations with columns for the initials and populations of states. The query would look as follows:

CREATE TABLE state_populations (
  state VARCHAR(2) PRIMARY KEY,
  population NUMERIC
);

Once you execute this statement, you can run a simple SELECT statement to verify that the table is created. However, you cannot see any row in the output as you have not run any statements to populate it.

Figure 2.32: Simple CREATE statement

Figure 2.32: Simple CREATE statement

Note

Sometimes, you may run a CREATE TABLE query and get the error relation {table_name} already exists. This simply means that a table with the same name already exists. You either must delete the table with the same name or change the name of your table. You will learn how to delete a table later in this chapter.

You will soon be exploring the second way to create a table, which is by using a SQL query. But first, you will do an exercise to create a blank table in SQL.

Exercise 2.03: Creating a Table in SQL

In this exercise, you will create a table using the CREATE TABLE statement. The marketing team at ZoomZoom would like to create a table called countries to analyze the data of different countries. It should have four columns: an integer key column, a unique name column, a founding year column, and a capital column.

Follow these steps to complete the exercise:

  1. Open pgAdmin, connect to the sqlda database, and open SQL query editor.
  2. Execute the following query to drop the countries table since it already exists in the database:
    DROP TABLE IF EXISTS countries;
  3. Run the following query to create the countries table:
    CREATE TABLE countries ( 
      key INT PRIMARY KEY,
      name text UNIQUE, 
      founding_year INT, 
      capital text
    );

You should get a result message as follows, which indicates the creation of a blank table:

Figure 2.33: CREATE statement for the countries table

Figure 2.33: CREATE statement for the countries table

Note

To access the source code for this specific section, please refer to https://packt.link/COMnA.

In this exercise, you learned how to create a table using different column constraints and the CREATE TABLE statement. In the next section, you will create tables using the SELECT query.

Creating Tables with SELECT

You already know how to create a table. However, say you wanted to create a table using data from an existing table. This can be done by using a modification of the CREATE TABLE statement:

CREATE TABLE {table_name} AS (
  {select_query}
);

Here, {select_query} is any SELECT query that can be run in your database. For instance, say you wanted to create a table based on the products table that only had products from the year 2014. Suppose the title of the table is products_2014; you could write the following query:

CREATE TABLE products_2014 AS ( 
  SELECT *
FROM products 
WHERE year=2014
);

Running this SQL will yield the following result:

Figure 2.34: CREATE from a SELECT query

Figure 2.34: CREATE from a SELECT query

This can be done with any query, and the table will inherit all the properties of the output query.

PostgreSQL also provides another way to create a table from a query, which utilizes a SELECT … INTO … syntax. An example of this syntax is shown below:

SELECT * 
INTO products_2014
FROM products 
WHERE year=2014;

Note

Before running this query, please check the table list in the sqlda database and make sure this table does not exist. If it does, please drop the table from the console.

This query achieves the same result as the CREATE … AS statement. In this book, you will use the CREATE … AS statement because the syntax inside the parenthesis is a complete SELECT statement, thus it is easier to create and modify the query without changing the structure of the statement. You can choose either based on your personal preference.

One issue with creating a table with a query is that the data types of the query are not explicitly specified and can be confusing. Luckily, PostgreSQL stores the table definitions in a set of system tables, and you can read the table definition from the system tables. For example, to check the column definitions of the products_2014 table, you can run the following SQL:

SELECT COLUMN_NAME, DATA_TYPE 
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_NAME = 'products_2014';

From the result, you can identify all the columns and their data types in the products_2014 table:

Figure 2.35: Query table definition from information schema

Figure 2.35: Query table definition from information schema

You have been reading a chapter from
SQL for Data Analytics - Third Edition
Published in: Aug 2022
Publisher: Packt
ISBN-13: 9781801812870
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image