R Data Visualization Recipes: A cookbook with 65+ data visualization recipes for smarter decision-making

By Bianchi Lanzetta
Paperback Nov 2017 366 pages 1st Edition

Plotting Two Continuous Variables

In this chapter, we will cover the following recipes:

  • Plotting a basic scatterplot
  • Hacking ggvis add_axis() function to operate as a title function
  • Plotting a scatterplot with shapes and colors
  • Plotting a shape reference palette for ggplot2
  • Dealing with over-plotting, reducing points
  • Dealing with over-plotting, jittering points
  • Dealing with over-plotting, alpha blending
  • Rug the margins using geom_rug()
  • Adding marginal histograms using ggExtra
  • Drawing marginal histogram using gridExtra
  • Crafting marginal plots with plotly
  • Adding regression lines
  • Adding quantile regression lines
  • Drawing publish-quality scatterplots

Introduction


Investigating the relationship between two variables is often much easier than investigating the relationships among several variables simultaneously. There is one good reason for that: bivariate relationships are far easier to visualize. Problems involving a large number of variables are therefore often split into several problems with only two variables each.

There are several visualizations that support the two-variable context. The most popular of them is probably the scatterplot. People are familiar with scatterplots, but one problem haunts many of them: over-plotting. This chapter begins with recipes for drawing very simple scatterplots, goes on to explore the solutions available for dealing with over-plotting, and finally demonstrates how to enhance scatterplots by setting up marginal graphics.

Plotting a basic scatterplot


Scatterplots play a major role in the representation of two continuous variables. Making simple scatterplots is an easy task with ggplot2, ggvis, or plotly. This recipe draws its plots from a data frame called iris, which comes with base R (datasets package).

Note

Before using data that comes from a package, you may want to try entering ?<package name>::<data frame name> into your console. For this recipe, that would be ?datasets::iris. This should lead you to the data documentation, where you can get to know each variable in the data frame.
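
Besides reading the help page, the data frame itself can be inspected from the console. A minimal sketch, assuming only base R; the object name petal_cols is hypothetical and introduced here for illustration:

```r
# iris ships with base R in the datasets package
data(iris, package = "datasets")

# Structure of the whole data frame: 150 rows, 4 numeric columns, 1 factor
str(iris)

# The two columns this recipe plots (petal_cols is a hypothetical name)
petal_cols <- iris[, c("Petal.Length", "Petal.Width")]
summary(petal_cols)
```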

From the various features presented by this data set, this recipe uses Petal.Width and Petal.Length. They respectively account for iris' petal widths and lengths measured in centimeters. Besides drawing the plots, this recipe also teaches how to add a title to them. So, move on to the coding!

How to do it...

  1. Initialize a ggplot and then give it the point geometry:
> library(ggplot2)
> sca1 ...
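
For orientation, a titled scatterplot of the two petal variables could look like the following. This is a minimal sketch, not the book's own listing: the object name p_basic and the title text are assumptions made for illustration.

```r
library(ggplot2)

# Map petal length to x and petal width to y, then add the point geometry
p_basic <- ggplot(data = iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point() +
  # ggtitle() adds a plot title (the wording here is a placeholder)
  ggtitle("Iris petal width versus petal length")

p_basic
```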

Hacking ggvis add_axis() function to operate as a title function


Version 0.4.3 of ggvis does not have a function to add titles to plots, but there is a known way to hack the add_axis() function to work as a title function. If you expect to use this trick many times, it's advisable to wrap it in a function. Besides making the code more readable, it's a quicker way to address the problem.

Getting ready

This recipe not only teaches how to craft the hack function but also tries it out on the previous plot, so make sure to have sca3 from the earlier recipe loaded into your environment. Alternatively, you can use another ggvis object of your own.

How to do it...

  1. Wrap the add_axis() function with several arguments declared to work as a title function:
> library(ggvis)
> ggvis_title <- function(vis, plot_title, title_size = 18, shift = 0, ...){ 
    add_axis(vis, 'x', ticks = 0, orient = 'top', 
             properties = axis_props( axis = list(strokeWidth = 0),
       ...

Plotting a scatterplot with shapes and colors


There are several aesthetics understood by geom_point() that can be changed. Typing ?geom_point into the R console will take you to the function documentation, which comes with a complete list of aesthetics understood by the function. The mandatory ones come in bold.

The names given are self-explanatory. Besides the mandatory x and y values, optional values range from alpha to stroke. For this particular recipe, we're settling for changes in the shape and colour arguments. The recipe also aims for similar results using both ggvis and plotly.
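
The same list of aesthetics can be read programmatically from the GeomPoint object that ggplot2 exports. A minimal sketch; the variable names required and optional are hypothetical, and the exact set of optional aesthetics may vary between ggplot2 versions:

```r
library(ggplot2)

# Aesthetics geom_point() cannot do without
required <- GeomPoint$required_aes

# Aesthetics that have defaults and can be overridden
optional <- names(GeomPoint$default_aes)

required
optional
```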

How to do it...

  1. Change the shape and colour arguments to get a better result:
> library(ggplot2)
> sca1 <- ggplot(data = iris, aes(x = Petal.Length, y = Petal.Width))
> sca1 + geom_point(aes(shape = Species, colour = Species))

Now each iris species is designated by a unique combination of shapes and colors:

Figure 2.3 - Adding shapes and colors to a scatter plot.

  2. plotly can also handle such...

Plotting a shape reference palette for ggplot2


Shapes are picked following a default scale when you input a variable to work as shape using ggplot2. You can always choose to tweak this scale to suit your preference. To do so, you need to know which shapes are available and how you can call for them. This recipe simply draws the following shape palette:

Figure 2.4 - ggplot2 shape palette.

It shows available points, plus the number used to call for them. Now let's explore the code that built it.

How to do it...

Draw a suitable data frame using the rep() and seq() functions, then plot it using geom_point():

> palette <- data.frame(x = rep(seq(1,5,1),5))
> palette$y <- c(rep(5,5),rep(4,5),rep(3,5),rep(2,5),rep(1,5))
> library(ggplot2)
> ggplot(data = palette,aes(x,y)) +
    geom_point(shape = seq(1,25,1), size = 10, fill ='white') +
    scale_size(range = c(2, 10)) +
    geom_text(nudge_y = .3, label = seq(1,25,1))

Function geom_text() is plotting the reference numbers related...
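
Once the shape numbers are known, the default shape scale can be replaced with a hand-picked one. A minimal sketch, assuming scale_shape_manual() from ggplot2; the codes 15, 17, and 19 are arbitrary picks, and p_shapes is a hypothetical name:

```r
library(ggplot2)

p_shapes <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point(aes(shape = Species), size = 3) +
  # One shape code per species, in the order of the factor levels
  scale_shape_manual(values = c(15, 17, 19))

p_shapes
```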

Dealing with over-plotting, reducing points


There are mainly three techniques used to deal with over-plotting: (i) adopting smaller points, (ii) jittering data, and (iii) alpha blending. These are useful tools, not only to deal with over-plotting but also to check whether there is over-plotting in the first place.

However, these are not the only options; for example, alternative geometries can also be used. No matter how troublesome over-plotting may be, there are good solutions available. No single solution is best for all situations, so you should know several of them.

This recipe shows how to apply a technique based on point size reduction using ggplot2, ggvis, and plotly. To do so, we rely on the ggplot2::diamonds data frame. Keep in mind that reducing points works better for cases where points are very close to each other but do not actually occupy the same coordinates.

How to do it...

  1. Set shape to '.' in order to reduce points using ggplot2:
> library(ggplot2)
> ...
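
A minimal sketch of the size-reduction idea, assuming the carat and price columns of ggplot2::diamonds as the two continuous variables (the choice of columns and the name p_small are assumptions, not taken from the recipe):

```r
library(ggplot2)

# shape = '.' draws each observation as a very small dot,
# which reduces overlap among the roughly 54,000 points
p_small <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(shape = ".")

p_small
```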

Dealing with over-plotting, jittering points


Size reduction is never an option when there are too many points sharing the exact same coordinates; it simply is not the right tool for the job. A clear option, therefore, is to jitter the data, that is, to add a little noise so that the points move around slightly and the over-plotting is reduced.

Two points must be highlighted here. First, jittering may be a good way to adjust the plot but not to adjust the data, so do not use jittered data for modeling, and always be transparent when transformations of that nature take place. Second, jittering works well when many points share coordinates; however, if many points are merely close to each other without sharing the same coordinates, there is a chance that jittering will work very badly.

Now let's go back to the iris data set and demonstrate how this technique can be applied using ggplot2, ggvis and plotly.

How to do it...

  1. With ggplot2, set position = 'jitter' in order to obtain...
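
A minimal sketch of jittering with ggplot2, assuming the same iris variables used earlier in the chapter. The seed value and the name p_jit are arbitrary; setting a seed only makes the random noise reproducible:

```r
library(ggplot2)

set.seed(10)  # jitter is random, so fix the seed for a repeatable picture

p_jit <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  # position = 'jitter' adds a small amount of random noise to each point
  geom_point(position = "jitter")

p_jit
```

Only the drawn positions change; iris itself is left untouched, which is in line with the advice not to model on jittered data.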

Dealing with over-plotting, alpha blending


Another popular technique is known as alpha blending. It consists of making points translucent, so the audience can tell whether points are stacked or not. Among all the techniques demonstrated so far, alpha blending is probably the most popular one. This recipe teaches how to apply alpha blending using ggplot2, plotly, and ggvis.

How to do it...

  1. Set the alpha parameter to apply alpha blending to ggplot:
> library(ggplot2)
> sca1 <- ggplot( iris, aes( x = Petal.Length, y = Petal.Width))
> sca1 + geom_point( alpha = .5 , 
                     aes(shape = Species, colour = Species))

Figure 2.7 shows alpha blending at work:

Figure 2.7 - alpha blending with ggplot2.

  2. Setting the alpha parameter with plotly will also apply alpha blending:
> library(plotly)
> sca9 <- plot_ly( iris, x = ~Petal.Length, y = ~Petal.Width, 
                   type = 'scatter', mode = 'markers', alpha = .5, symbol = ~Species)
> sca9
  3. ggvis applies...

Rug the margins using geom_rug()


Up till now, the chapter has focused on how to draw scatterplots and on solutions to over-plotting. The upcoming recipes, including this one, focus on enhancing scatterplots. If there is a bivariate relation to be displayed, there are also two univariate distributions to show. How can they be used to improve the plots?

The answer lies in filling the margins with supplemental plots that represent the underlying univariate distributions. Still relying on the iris data set, this recipe introduces a simple solution, almost restricted to ggplot2. Let's draw rug plots in the margins with geom_rug().

How to do it...

  1. Draw a scatterplot using ggplot2 and add the geom_rug() layer:
> set.seed(50) ; library(ggplot2)
> rug <- ggplot(iris,
                aes(x = Petal.Length, 
                    y = Petal.Width, 
                    colour = Species))
> rug <- rug +
    geom_jitter(aes( shape = Species), alpha = .4) +
    geom_rug(position...
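
A simplified illustration of the idea, not the recipe's full listing: geom_rug() draws one tick per observation along the chosen sides of the panel. The sides and alpha values and the name p_rug are assumptions made for this sketch:

```r
library(ggplot2)

p_rug <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width,
                          colour = Species)) +
  geom_point(alpha = .4) +
  # sides = 'bl' puts the rugs on the bottom and left margins
  geom_rug(sides = "bl", alpha = .4)

p_rug
```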

Adding marginal histograms using ggExtra


Another way to go is to draw histograms or even density distributions in the margins. Drawing tailor-made plots in the margins would require more code. On the other hand, if there is no need for greater customization, the ggExtra package can be used to save many lines of code. This recipe demonstrates how to use ggExtra to easily draw histograms in the margins of a scatterplot.

Getting ready

In order to properly execute this recipe, the ggExtra package must be installed and loaded. Run the following code to make that happen:

> if( !require(ggExtra)){ install.packages('ggExtra')}

Once ggExtra is installed we can go on.

How to do it...

  1. Draw a ggplot2 scatterplot like this:
library(ggplot2)
base_p <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width, colour = Species))
scatter <- base_p + geom_point( alpha = .5, aes(shape = Species)) + 
  geom_rug(alpha = .5, sides = 'tr', show.legend = FALSE) +
  theme(legend.position = 'bottom')
  2. Load ggExtra and input the...
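
A minimal sketch of the marginal-histogram step, assuming ggMarginal() from ggExtra with type = 'histogram'. The plot is rebuilt here so the block runs on its own, and the names p_scatter and p_marg are hypothetical:

```r
library(ggplot2)
library(ggExtra)

p_scatter <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width,
                              colour = Species)) +
  geom_point(alpha = .5) +
  theme(legend.position = "bottom")

# ggMarginal() wraps the scatterplot and adds a histogram to each margin
p_marg <- ggMarginal(p_scatter, type = "histogram")

p_marg
```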

Key benefits

  • Use R's popular packages—such as ggplot2, ggvis, ggforce, and more—to create custom, interactive visualization solutions.
  • Create, design, and build interactive dashboards using Shiny
  • A highly practical guide to help you get to grips with the basics of data visualization techniques, and how you can implement them using R

Description

R is an open source language for data analysis and graphics that allows users to load various packages for effective and better data interpretation. Its popularity has soared in recent years because of its powerful capabilities when it comes to turning different kinds of data into intuitive visualization solutions. This book is an update to our earlier R data visualization cookbook with 100 percent fresh content and covering all the cutting edge R data visualization tools. This book is packed with practical recipes, designed to provide you with all the guidance needed to get to grips with data visualization using R. It starts off with the basics of ggplot2, ggvis, and plotly visualization packages, along with an introduction to creating maps and customizing them, before progressively taking you through various ggplot2 extensions, such as ggforce, ggrepel, and gganimate. Using real-world datasets, you will analyze and visualize your data as histograms, bar graphs, and scatterplots, and customize your plots with various themes and coloring options. The book also covers advanced visualization aspects such as creating interactive dashboards using Shiny. By the end of the book, you will be equipped with key techniques to create impressive data visualizations with professional efficiency and precision.

Who is this book for?

If you are looking to create custom data visualization solutions using the R programming language and are stuck somewhere in the process, this book will come to your rescue. Prior exposure to packages such as ggplot2 would be useful but not necessary. However, some R programming knowledge is required.

What you will learn

  • Get to know various data visualization libraries available in R to represent data
  • Write elegant code to craft graphics using ggplot2, ggvis, and plotly
  • Add elements, text, animation, and colors to your plot to make sense of data
  • Deepen your knowledge by adding bar-charts, scatterplots, and time series plots using ggplot2
  • Build interactive dashboards using Shiny
  • Color specific map regions based on the values of a variable in your data frame
  • Create high-quality journal-publishable scatterplots
  • Create and design various three-dimensional and multivariate plots

Product Details

Publication date: Nov 22, 2017
Length: 366 pages
Edition: 1st
Language: English
ISBN-13: 9781788398312
Table of Contents

12 Chapters
  1. Installation and Introduction
  2. Plotting Two Continuous Variables
  3. Plotting a Discrete Predictor and a Continuous Response
  4. Plotting One Variable
  5. Making Other Bivariate Plots
  6. Creating Maps
  7. Faceting
  8. Designing Three-Dimensional Plots
  9. Using Theming Packages
  10. Designing More Specialized Plots
  11. Making Interactive Plots
  12. Building Shiny Dashboards

Customer reviews

Rating: 4 out of 5 (1 rating)

Rating distribution
5 star 0%
4 star 100%
3 star 0%
2 star 0%
1 star 0%

Amazon Customer, Jan 16, 2018 (4 out of 5, Amazon verified review)
A clear and compelling read, and a great way to get a handle on the major methods of data visualization in R.