Practical Predictive Analytics

Analyse current and historical data to predict future trends using R, Spark, and more

Product type Paperback
Published in Jun 2017
Publisher Packt
ISBN-13 9781785886188
Length 576 pages
Edition 1st Edition
Author: Ralph Winters

R packages

An R package extends the functionality of base R. Base R, by itself, is very capable, and you can do an incredible amount of analytics without adding any packages. However, adding a package may be beneficial if it adds functionality which does not exist in base R, improves or builds upon existing functionality, or just makes something that you can already do easier.

For example, base R has no built-in functions for certain types of machine learning (such as Random Forests). As a result, you need to search for an add-on package which provides this functionality. Fortunately, you are covered: there are many packages available which implement this algorithm.

Bear in mind that there are always new packages coming out. I tend to favor packages which have been on CRAN for a long time and have a large user base. When installing something new, I will try to check its results against other packages which do similar things. Speed is another reason to consider adopting a new package.
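
A cross-check of this kind can be sketched with base R alone: fit the same model by two independent routes and compare the numbers with all.equal(). The built-in women dataset and the normal-equations solve are assumptions chosen purely for illustration; in practice, the second set of results would come from the other package you are evaluating.

```r
# Cross-check a fitted model against an independent computation.
# Assumption: the built-in 'women' dataset stands in for your own data.
fit <- lm(height ~ weight, data = women)
coef_lm <- coef(fit)

# Independent route: solve the normal equations (X'X) b = X'y directly.
X <- cbind("(Intercept)" = 1, weight = women$weight)
y <- women$height
coef_manual <- drop(solve(crossprod(X), crossprod(X, y)))

# all.equal() compares with a numerical tolerance rather than exact equality.
results_agree <- isTRUE(all.equal(coef_lm, coef_manual, tolerance = 1e-6))
print(results_agree)
```

Comparing with a tolerance matters here, because two correct implementations will rarely agree to the last decimal place.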

The stargazer package

For an example of a package which simply makes life easier, let's first consider the output produced by running the summary() function on the regression results, as we did previously. You can run it again if you wish.

summary(lm_output)

The amount of statistical information output by the summary() function can be overwhelming to the uninitiated. This is due not only to the amount of output, but also to its formatting. That is why I did not show the entire output in the previous example.

One way to make output easier to look at is to first reduce the amount of output that is presented, and then reformat it so it is easier on the eyes.
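
Even before reaching for a package, base R lets you trim the output yourself. The following is a minimal sketch, assuming a stand-in lm_output fitted on the built-in women dataset (skip that line if lm_output already exists in your session). It keeps only the coefficient table and a single headline statistic from the summary object.

```r
# Assumption: a stand-in for lm_output, fitted on the built-in 'women' dataset.
lm_output <- lm(height ~ weight, data = women)

model_summary <- summary(lm_output)

# Keep only the coefficient table (estimate, std. error, t value, p-value).
coef_table <- coef(model_summary)
print(round(coef_table, 4))

# Pull out a single headline statistic instead of the full printout.
r_squared <- model_summary$r.squared
cat("R-squared:", round(r_squared, 3), "\n")
```

This reduces the amount of output, but the formatting is still plain; that second step is where a formatting package helps.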

To accomplish this, we can utilize a package called stargazer, which will reformat the large volume of output produced by the summary() function and simplify its presentation. Stargazer excels at reformatting the output of many regression models and displaying the results as HTML, LaTeX (which can then be compiled to PDF), or simple formatted text. By default, it will show you the most important statistical output for various models, and you can always specify the types of statistical output that you want to see.

To obtain more information on the stargazer package, you can go to CRAN and search for its documentation, or you can use the R help system.

If you have already installed stargazer, you can use the following command:

packageDescription("stargazer")

If you haven't installed the package, information about stargazer (or other packages) can also be found using R-specific internet searches:

RSiteSearch("stargazer")

If you like searching for documentation within R, you can obtain more information about the R help system at:

https://www.r-project.org/help.html
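
The help-system calls above can be tried on any installed package. Here is a small sketch which uses the stats package (an assumption made so that the code runs even if stargazer is not yet installed) to show what packageDescription() returns, along with a quick way to check whether stargazer is present.

```r
# Sketch: query package metadata from within R.
# Assumption: 'stats' (shipped with R) is used so this runs without stargazer.
pkg <- "stats"

# packageDescription() returns the fields of the package's DESCRIPTION file.
desc <- packageDescription(pkg)
cat("Package:", desc$Package, "\n")
cat("Version:", desc$Version, "\n")

# packageVersion() returns the version as an object that can be compared.
pkg_version <- packageVersion(pkg)

# requireNamespace() reports whether a package is installed without attaching it.
has_stargazer <- requireNamespace("stargazer", quietly = TRUE)
cat("stargazer installed:", has_stargazer, "\n")

# Interactive help calls (they open a help page, so they are commented out here):
# help(package = "stargazer")   # index of the package's help topics
# ?stargazer                    # help for one function, once the package is loaded
```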

Installing the stargazer package

Now, on to installing stargazer:

  • First create a new R script (File | New File | R Script).
  • Enter the following lines and then select Source from the menu bar in the code pane, which will submit the entire script:
        install.packages("stargazer")
        library(stargazer)
        stargazer(lm_output, type = "text")

After the script has been run, the reformatted regression table should appear in the Console.

Code description

Here is a line-by-line description of the code which you have just run:

  • install.packages("stargazer"): This line will install the package to the default package directory on your machine. If you rerun this code, you can comment out this line, since the package will already be installed in your R library.
  • library(stargazer): Installing a package does not make the package automatically available. You need to call the library() (or require()) function in order to actually load the stargazer package.
  • stargazer(lm_output, type = "text"): This line will take the output object lm_output that was created in the first script, condense the output, and write it out to the console in a simpler, more readable format. There are many other options in the stargazer package, including ones which will format the output as HTML or LaTeX.

Please refer to the reference manual at https://cran.r-project.org/web/packages/stargazer/index.html for more information.
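
As an alternative to commenting out the install line by hand, the script can check whether the package is already present. The helper below is a sketch only; ensure_package() is a hypothetical name, not a function provided by R or by stargazer.

```r
# Hypothetical helper (not part of base R or stargazer):
# install a package only if it is not already installed.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package can now be loaded.
  requireNamespace(pkg, quietly = TRUE)
}

# 'stats' ships with R, so this call returns TRUE without installing anything.
stats_ready <- ensure_package("stats")
print(stats_ready)

# For the script in this section, the first two lines could become:
# ensure_package("stargazer")
# library(stargazer)
```

With this in place, the script can be rerun any number of times without reinstalling the package on each run.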

The reformatted results will appear in the R Console. As you can see, the output written to the console is much cleaner and easier to read.
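
The HTML option mentioned earlier can be sketched in the same way. This example assumes a stand-in lm_output fitted on the built-in women dataset and writes to a temporary file whose name is made up for illustration; it only runs the stargazer call if the package is installed.

```r
# Assumption: a stand-in for lm_output, fitted on the built-in 'women' dataset.
lm_output <- lm(height ~ weight, data = women)

# Hypothetical output location in the session's temporary directory.
html_file <- file.path(tempdir(), "lm_output.html")

if (requireNamespace("stargazer", quietly = TRUE)) {
  # type = "html" switches the format; out = writes the table to a file.
  stargazer::stargazer(lm_output, type = "html", out = html_file)
} else {
  message("stargazer is not installed; skipping the HTML example.")
}
```

The resulting file can be opened in a browser or pasted into a report.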

Saving your work

After you are done, select File | Save from the menu bar.

Then navigate to the PracticalPredictiveAnalytics/Outputs folder that was created, name the file Chapter1_LinearRegressionOutput, and press Save.
