
Learning Bayesian Models with R: Become an expert in Bayesian Machine Learning methods using R and apply them to solve real-world big data problems

By Hari Manassery Koduvely
Paperback Oct 2015 168 pages 1st Edition

Learning Bayesian Models with R

Chapter 1. Introducing the Probability Theory

Bayesian inference is a method of learning about the relationship between variables from data, in the presence of uncertainty, in real-world problems. It is one of the frameworks of probability theory. Any reader interested in Bayesian inference should have a good knowledge of probability theory to understand and use Bayesian inference. This chapter covers an overview of probability theory, which will be sufficient to understand the rest of the chapters in this book.

It was Pierre-Simon Laplace who first proposed a formal definition of probability with mathematical rigor. This definition is called the Classical Definition and it states the following:

 

The theory of chance consists in reducing all the events of the same kind to a certain number of cases equally possible, that is to say, to such as we may be equally undecided about in regard to their existence, and in determining the number of cases favorable to the event whose probability is sought. The ratio of this number to that of all the cases possible is the measure of this probability, which is thus simply a fraction whose numerator is the number of favorable cases and whose denominator is the number of all the cases possible.

 
 --Pierre-Simon Laplace, A Philosophical Essay on Probabilities

What this definition means is that, if a random experiment can result in $N$ mutually exclusive and equally likely outcomes, the probability of the event $A$ is given by:

$$P(A) = \frac{N_A}{N}$$

Here, $N_A$ is the number of occurrences of the event $A$.

To illustrate this concept, let us take the simple example of rolling a die. If the die is fair, all the faces have an equal chance of showing up when it is rolled, so the probability of each face showing up is 1/6. However, when one rolls the die 100 times, the faces will not come up in exactly equal proportions of 1/6, due to random fluctuations. The estimate of the probability of each face is the number of times the face shows up divided by the number of rolls. When the number of rolls is very large, this ratio will be close to 1/6.
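The behavior of the relative frequencies can be checked with a short simulation. The following is a minimal Python sketch, not taken from the book; the function name `face_frequencies` and the fixed seed are illustrative assumptions.

```python
import random
from collections import Counter

def face_frequencies(n_rolls, seed=0):
    """Roll a fair six-sided die n_rolls times; return the relative frequency of each face.

    The name and the seed argument are hypothetical, chosen for this illustration.
    """
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

# Compare a small and a large number of rolls against the ideal value 1/6.
for n in (100, 100_000):
    freqs = face_frequencies(n)
    worst = max(abs(f - 1 / 6) for f in freqs.values())
    print(f"n={n}: largest deviation from 1/6 = {worst:.4f}")
```

With more rolls, the largest deviation from 1/6 should shrink, roughly in proportion to one over the square root of the number of rolls.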

In the long run, this classical definition treats the probability of an uncertain event as the relative frequency of its occurrence. This is also called a frequentist approach to probability. Although this approach is suitable for a large class of problems, there are cases where this type of approach cannot be used. As an example, consider the following question: Is Pataliputra the name of an ancient city or a king? In such cases, we have a degree of belief in various plausible answers, but it is not based on counts in the outcome of an experiment (in the Sanskrit language Putra means son, therefore some people may believe that Pataliputra is the name of an ancient king in India, but it is a city).

Another example is, What is the chance of the Democratic Party winning the election in 2016 in America? Some people may believe it is 1/2 and some people may believe it is 2/3. In this case, probability is defined as the degree of belief of a person in the outcome of an uncertain event. This is called the subjective definition of probability.

One of the limitations of the classical or frequentist definition of probability is that it cannot address subjective probabilities. As we will see later in this book, Bayesian inference is a natural framework for treating both frequentist and subjective interpretations of probability.

Probability distributions

In both classical and Bayesian approaches, a probability distribution function is the central quantity, which captures all of the information about the relationship between variables in the presence of uncertainty. A probability distribution assigns a probability value to each measurable subset of outcomes of a random experiment. The variable involved could be discrete or continuous, and univariate or multivariate. Although people use slightly different terminologies, the commonly used probability distributions for the different types of random variables are as follows:

  • Probability mass function (pmf) for discrete numerical random variables
  • Categorical distribution for categorical random variables
  • Probability density function (pdf) for continuous random variables

One of the well-known distribution functions is the normal or Gaussian distribution, which is named after Carl Friedrich Gauss, a famous German mathematician and physicist. It is also known by the name bell curve because of its shape. The mathematical form of this distribution is given by:

$$\mathcal{N}(x \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

Here, $\mu$ is the mean or location parameter and $\sigma$ is the standard deviation or scale parameter ($\sigma^2$ is called variance). The following graphs show what the distribution looks like for different values of location and scale parameters:

[Figure: normal distribution for different values of the location and scale parameters]

One can see that as the mean changes, the location of the peak of the distribution changes. Similarly, when the standard deviation changes, the width of the distribution also changes.
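The effect of the two parameters can also be seen numerically. This is a small Python sketch, assuming the standard normal density formula; the function name `normal_pdf` and the parameter values are chosen only for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Illustrative (hypothetical) parameter choices: the peak sits at x = mu,
# and its height 1 / (sigma * sqrt(2 * pi)) drops as the curve widens.
for mu, sigma in [(0.0, 1.0), (2.0, 1.0), (0.0, 3.0)]:
    print(f"mu={mu}, sigma={sigma}: density at the peak = {normal_pdf(mu, mu, sigma):.4f}")
```

Changing `mu` moves the peak without changing its height, whereas a larger `sigma` lowers and widens the curve.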

Many natural datasets approximately follow a normal distribution because, according to the central limit theorem, a random variable that is the mean of a large number of independent random variables is approximately normally distributed. This holds irrespective of the form of the distribution of the individual random variables, as long as they have finite mean and variance and are all drawn from the same original distribution. The normal distribution is also very popular among data scientists because, in many statistical inference problems, theoretical results can be derived if the underlying distribution is normal.
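The central limit theorem can be illustrated with a quick simulation. The sketch below (Python, standard library only) averages uniform random numbers, which are far from normal individually; the function name `sample_means`, the sample sizes, and the seed are assumptions made for this example.

```python
import random
import statistics

def sample_means(n_terms, n_samples, seed=0):
    """Return n_samples means, each computed from n_terms Uniform(0, 1) draws."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.random() for _ in range(n_terms))
            for _ in range(n_samples)]

# Uniform(0, 1) has mean 1/2 and variance 1/12, so the mean of n_terms draws
# has standard deviation sqrt(1 / (12 * n_terms)).
n_terms = 30
means = sample_means(n_terms, n_samples=20_000)
expected_sd = (1 / (12 * n_terms)) ** 0.5

# For a normal distribution, about 68.3% of the mass lies within one standard deviation.
within_one_sd = sum(abs(m - 0.5) <= expected_sd for m in means) / len(means)
print(f"observed sd = {statistics.pstdev(means):.4f}, expected sd = {expected_sd:.4f}")
print(f"fraction within one sd = {within_one_sd:.3f}")
```

The fraction within one standard deviation should come out close to the value expected for a normal distribution, even though each individual draw is uniform.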

Now, let us look at the multidimensional version of the normal distribution. Let the random variable be an N-dimensional vector $\mathbf{x}$, denoted by:

$$\mathbf{x} = [x_1, x_2, \ldots, x_N]$$

Then, the corresponding normal distribution is given by:

$$\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{N/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$$

Here, $\boldsymbol{\mu}$ corresponds to the mean (also called location) and $\Sigma$ is an N x N covariance matrix (also called scale).

To get a better understanding of the multidimensional normal distribution, let us take the case of two dimensions. In this case, $\mathbf{x} = (x, y)$ and the covariance matrix is given by:

$$\Sigma = \begin{pmatrix} \sigma_x^2 & \rho \sigma_x \sigma_y \\ \rho \sigma_x \sigma_y & \sigma_y^2 \end{pmatrix}$$

Here, $\sigma_x^2$ and $\sigma_y^2$ are the variances along the $x$ and $y$ directions, and $\rho$ is the correlation between $x$ and $y$. A plot of a two-dimensional normal distribution for given values of $\sigma_x^2$ and $\sigma_y^2$ and a high correlation $\rho$ is shown in the following image:

[Figure: plot of a two-dimensional normal distribution with high correlation]

If $\rho = 0$, then the two-dimensional normal distribution will be reduced to the product of two one-dimensional normal distributions, since $\Sigma$ would become diagonal in this case. The following 2D projections of the normal distribution for the same values of $\sigma_x^2$ and $\sigma_y^2$, but with a high value of $\rho$ and with $\rho = 0$, illustrate this case:

[Figure: 2D projections of the normal distribution for high correlation and for zero correlation]

The high correlation between x and y in the first case forces most of the data points along the 45 degree line and makes the distribution more anisotropic; whereas, in the second case, when the correlation is zero, the distribution is more isotropic.
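Both observations can be verified numerically. The following Python sketch writes out the two-dimensional normal density in terms of $\sigma_x$, $\sigma_y$, and $\rho$; the function names and the parameter values are hypothetical and are not the ones used for the book's plots.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Two-dimensional normal density for correlation rho with |rho| < 1."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx * zx - 2 * rho * zx * zy + zy * zy) / (1 - rho * rho)
    norm = 2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho * rho)
    return math.exp(-0.5 * q) / norm

# With rho = 0 the joint density factorizes into two one-dimensional densities.
joint = bivariate_normal_pdf(1.0, -0.5, 0.0, 0.0, 3.0, 2.0, rho=0.0)
product = normal_pdf(1.0, 0.0, 3.0) * normal_pdf(-0.5, 0.0, 2.0)
print(joint, product)

# With a high positive rho, points near the x = y line are far more likely
# than points at the same distance on the opposite diagonal.
print(bivariate_normal_pdf(1.0, 1.0, 0.0, 0.0, 1.0, 1.0, rho=0.8))
print(bivariate_normal_pdf(1.0, -1.0, 0.0, 0.0, 1.0, 1.0, rho=0.8))
```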

We will briefly review some of the other well-known distributions used in Bayesian inference later in this chapter.

Conditional probability

Often, one would be interested in finding the probability of the occurrence of a set of random variables when other random variables in the problem are held fixed. As an example from a population health study, one would be interested in finding the probability that a person in the age range 40-50 develops heart disease, given that the person has high blood pressure and diabetes. Questions such as these can be modeled using conditional probability, which is defined as the probability of an event, given that another event has happened. More formally, if we take the events A and B, this definition can be written as follows:

$$P(A \mid B) = \frac{P(A, B)}{P(B)}$$

Similarly:

$$P(B \mid A) = \frac{P(A, B)}{P(A)}$$

The following Venn diagram explains the concept more clearly:

[Figure: Venn diagram of the events A and B and their intersection]
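As a worked illustration of these definitions, the following Python sketch computes conditional probabilities from a table of joint counts. The counts are entirely hypothetical (invented for this example, not taken from any study), with A standing for "has heart disease" and B for "has high blood pressure".

```python
from fractions import Fraction

# Hypothetical counts for 1000 people: (has_high_bp, has_heart_disease) -> count
counts = {
    (True, True): 90,
    (True, False): 210,
    (False, True): 60,
    (False, False): 640,
}
total = sum(counts.values())

def prob(event):
    """Probability of the set of outcomes for which event(outcome) is true."""
    return Fraction(sum(c for outcome, c in counts.items() if event(outcome)), total)

p_a_and_b = prob(lambda o: o[0] and o[1])  # P(A, B)
p_b = prob(lambda o: o[0])                 # P(B): high blood pressure
p_a = prob(lambda o: o[1])                 # P(A): heart disease

p_a_given_b = p_a_and_b / p_b              # P(A | B) = P(A, B) / P(B)
p_b_given_a = p_a_and_b / p_a              # P(B | A) = P(A, B) / P(A)
print(p_a_given_b, p_b_given_a)
```

Note that the two conditional probabilities differ, although both are built from the same joint probability.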

In Bayesian inference, we are interested in conditional probabilities corresponding to multivariate distributions. If $\{z_1, z_2, \ldots, z_N\}$ denotes the entire random variable set, then the conditional probability of $\{z_1, \ldots, z_M\}$, given that $\{z_{M+1}, \ldots, z_N\}$ is fixed at some value, is given by the ratio of the joint probability of $\{z_1, z_2, \ldots, z_N\}$ and the joint probability of $\{z_{M+1}, \ldots, z_N\}$:

$$P(z_1, \ldots, z_M \mid z_{M+1}, \ldots, z_N) = \frac{P(z_1, z_2, \ldots, z_N)}{P(z_{M+1}, \ldots, z_N)}$$

In the case of the two-dimensional normal distribution, the conditional probability of interest is as follows:

$$P(x \mid y) = \frac{\mathcal{N}(x, y \mid \boldsymbol{\mu}, \Sigma)}{\mathcal{N}(y \mid \mu_y, \sigma_y)}$$

It can be shown (exercise 2 in the Exercises section of this chapter) that the RHS can be simplified, resulting in an expression for $P(x \mid y)$ in the form of a normal distribution again, with the mean $\bar{\mu} = \mu_x + \rho \frac{\sigma_x}{\sigma_y} (y - \mu_y)$ and variance $\bar{\sigma}^2 = (1 - \rho^2) \sigma_x^2$.
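This result can be sanity-checked numerically before attempting the derivation. The Python sketch below divides the joint density by the marginal density of $y$ and compares the ratio with a one-dimensional normal density that uses the standard conditional mean and variance of a bivariate normal; all function names and parameter values are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Two-dimensional normal density for correlation rho with |rho| < 1."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx * zx - 2 * rho * zx * zy + zy * zy) / (1 - rho * rho)
    return math.exp(-0.5 * q) / (2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho * rho))

def conditional_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """P(x | y) computed directly as joint density over marginal density of y."""
    joint = bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho)
    return joint / normal_pdf(y, mu_y, sigma_y)

# Hypothetical parameter values, chosen only for illustration.
params = dict(mu_x=1.0, mu_y=-2.0, sigma_x=3.0, sigma_y=2.0, rho=0.8)
y = 0.5
cond_mean = params["mu_x"] + params["rho"] * params["sigma_x"] / params["sigma_y"] * (y - params["mu_y"])
cond_sd = params["sigma_x"] * math.sqrt(1 - params["rho"] ** 2)

for x in (-1.0, 2.0, 4.0):
    print(x, conditional_pdf(x, y, **params), normal_pdf(x, cond_mean, cond_sd))
```

The two printed columns should agree up to floating-point rounding for every value of `x`.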

Bayesian theorem

From the definition of the conditional probabilities $P(A \mid B)$ and $P(B \mid A)$, it is easy to show the following:

$$P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$$

Rev. Thomas Bayes (1701–1761) used this rule to formulate his famous Bayes' theorem, which can be interpreted as follows: if $P(A)$ represents the initial degree of belief (or prior probability) in the value of a random variable A before observing B, then its posterior probability, or degree of belief after accounting for B, is updated according to the preceding equation. So, Bayesian inference essentially corresponds to updating beliefs about an uncertain system after having made some observations about it. In a sense, this is also how we human beings learn about the world. For example, before we visit a new city, we will have certain prior knowledge about the place from reading books or the Web.

However, soon after we reach the place, this belief will get updated based on our initial experience of the place. We continuously update the belief as we explore the new city more and more. We will describe Bayesian inference more in detail in Chapter 3, Introducing Bayesian Inference.
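The belief-updating view of Bayes' theorem can be made concrete with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `posterior` and all the probabilities are hypothetical, and $P(B)$ is obtained by summing over the two cases "A true" and "A false".

```python
def posterior(prior, likelihood, likelihood_if_false):
    """Return P(A | B) from P(A), P(B | A), and P(B | not A) using Bayes' theorem."""
    evidence = likelihood * prior + likelihood_if_false * (1 - prior)  # P(B)
    return likelihood * prior / evidence

# Hypothetical numbers: a 1% prior belief in A, and an observation B that is
# much more likely when A is true (0.9) than when it is false (0.05).
belief = 0.01
for observation in (1, 2, 3):
    belief = posterior(belief, likelihood=0.9, likelihood_if_false=0.05)
    print(f"belief in A after observation {observation}: {belief:.4f}")
```

Each posterior becomes the prior for the next observation, which mirrors the way the belief about a new city keeps being revised as it is explored.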

Marginal distribution

In many situations, we are interested only in the probability distribution of a subset of random variables. For example, in the heart disease problem mentioned in the previous section, if we want to infer the probability of people in a population having a heart disease as a function of their age only, we need to integrate out the effect of other random variables such as blood pressure and diabetes. This is called marginalization:

$$P(A) = \sum_{B} P(A, B)$$

Or, for continuous random variables:

$$P(A) = \int P(A, B) \, dB$$

Note that marginal distribution is very different from conditional distribution. In conditional probability, we are finding the probability of a subset of random variables with the values of the other random variables fixed (conditioned) at a given value. In the case of marginal distribution, we are eliminating the effect of a subset of random variables by integrating them out (in a sense, averaging over their effect) from the joint distribution. For example, in the case of the two-dimensional normal distribution, marginalization with respect to one variable will result in a one-dimensional normal distribution of the other variable, as follows:

$$\int \mathcal{N}(x, y \mid \boldsymbol{\mu}, \Sigma) \, dy = \mathcal{N}(x \mid \mu_x, \sigma_x)$$

The details of this integration are given as an exercise (exercise 3 in the Exercises section of this chapter).
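Before doing the integral by hand, the claim can be checked numerically. The Python sketch below integrates the two-dimensional normal density over $y$ with a simple midpoint rule and compares the result with the one-dimensional normal density of $x$; the function names, integration settings, and parameter values are assumptions made for this example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Two-dimensional normal density for correlation rho with |rho| < 1."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx * zx - 2 * rho * zx * zy + zy * zy) / (1 - rho * rho)
    return math.exp(-0.5 * q) / (2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho * rho))

def marginal_x(x, mu_x, mu_y, sigma_x, sigma_y, rho, half_width=10.0, steps=4000):
    """Integrate the joint density over y (midpoint rule on mu_y +/- half_width * sigma_y)."""
    lower = mu_y - half_width * sigma_y
    h = 2 * half_width * sigma_y / steps
    return h * sum(
        bivariate_normal_pdf(x, lower + (i + 0.5) * h, mu_x, mu_y, sigma_x, sigma_y, rho)
        for i in range(steps)
    )

# Hypothetical parameter values, chosen only for illustration.
params = dict(mu_x=1.0, mu_y=-2.0, sigma_x=3.0, sigma_y=2.0, rho=0.8)
for x in (-2.0, 1.0, 4.0):
    print(x, marginal_x(x, **params), normal_pdf(x, params["mu_x"], params["sigma_x"]))
```

The marginal depends only on $\mu_x$ and $\sigma_x$; the correlation and the parameters of $y$ drop out after integration.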

Expectations and covariance

Once the distribution of a set of random variables $\{x_1, x_2, \ldots, x_N\}$ is known, what one is typically interested in for real-life applications is estimating the average values of these random variables and the correlations between them. These are computed formally using the following expressions:

$$E[x_i] = \int x_i \, P(x_1, \ldots, x_N) \, dx_1 \cdots dx_N$$
$$\sigma(x_i, x_j) = E\left[(x_i - E[x_i])(x_j - E[x_j])\right]$$

For example, in the case of the two-dimensional normal distribution, if we are interested in finding the correlation between the variables $x$ and $y$, it can be formally computed from the joint distribution using the following formula:

$$\rho = \frac{1}{\sigma_x \sigma_y} \int\!\!\int (x - \mu_x)(y - \mu_y) \, \mathcal{N}(x, y \mid \boldsymbol{\mu}, \Sigma) \, dx \, dy$$

Binomial distribution

A binomial distribution is a discrete distribution that gives the probability of obtaining a given number of heads in n independent trials, where each trial has one of two possible outcomes, heads or tails, with the probability of heads being p. Each of the trials is called a Bernoulli trial. The functional form of the binomial distribution is given by:

$$P(k; n, p) = \binom{n}{k} p^k (1 - p)^{n - k}$$

Here, $P(k; n, p)$ denotes the probability of having k heads in n trials. The mean of the binomial distribution is given by np and the variance is given by np(1-p). Have a look at the following graphs:

[Figure: binomial distribution for n = 100 and n = 1000 with p = 0.7]

The preceding graphs show the binomial distribution for two values of n, 100 and 1000, with p = 0.7. As you can see, when n becomes large, the binomial distribution becomes sharply peaked relative to the range of possible values. It can be shown that, in the large n limit, a binomial distribution can be approximated using a normal distribution with mean np and variance np(1-p). This characteristic is shared by many discrete distributions: in the large n limit, they can be approximated by some continuous distribution.
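The quality of the normal approximation can be checked directly. The following Python sketch evaluates the binomial probability at its mean and compares it with the normal density with mean np and variance np(1-p); the function names are hypothetical, and logarithms are used only to keep the large factorials numerically manageable.

```python
import math

def binomial_pmf(k, n, p):
    """P(k; n, p) for 0 < p < 1, computed via logarithms to avoid overflow for large n."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coeff + k * math.log(p) + (n - k) * math.log(1 - p))

def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

p = 0.7
for n in (100, 1000):
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    k = round(mean)
    print(f"n={n}: binomial={binomial_pmf(k, n, p):.5f}, "
          f"normal={normal_pdf(k, mean, sd):.5f}, sd/n={sd / n:.4f}")
```

The ratio `sd/n` shrinks as n grows, which is why the distribution looks more sharply peaked when plotted over the full range from 0 to n.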

Beta distribution

The Beta distribution, denoted by $\mathrm{Beta}(x \mid \alpha, \beta)$, is a function of powers of $x$ and of its reflection $(1 - x)$, and is given by:

$$\mathrm{Beta}(x \mid \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}$$

Here, $\alpha, \beta > 0$ are parameters that determine the shape of the distribution function and $B(\alpha, \beta)$ is the Beta function given by the ratio of Gamma functions: $B(\alpha, \beta) = \Gamma(\alpha) \Gamma(\beta) / \Gamma(\alpha + \beta)$.

The Beta distribution is a very important distribution in Bayesian inference. It is the conjugate prior probability distribution (which will be defined more precisely in the next chapter) for binomial, Bernoulli, negative binomial, and geometric distributions. It is used for modeling the random behavior of percentages and proportions. For example, the Beta distribution has been used for modeling allele frequencies in population genetics, time allocation in project management, the proportion of minerals in rocks, and heterogeneity in the probability of HIV transmission.
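The role of the Beta function as a normalizing constant can be seen in a short numerical check. This Python sketch uses the standard Beta density with the Gamma-function form of $B(\alpha, \beta)$; the function name `beta_pdf` and the parameter values are illustrative assumptions.

```python
import math

def beta_pdf(x, a, b):
    """Beta density on 0 < x < 1 with shape parameters a, b > 0."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)  # -log B(a, b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def integrate_unit_interval(f, steps=10_000):
    """Midpoint-rule integral of f over (0, 1)."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))

# Hypothetical shape parameters; the density should integrate to 1.
print(integrate_unit_interval(lambda x: beta_pdf(x, 2.0, 5.0)))

# Beta(1, 1) is the uniform distribution on (0, 1).
print(beta_pdf(0.3, 1.0, 1.0))
```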

Gamma distribution

The Gamma distribution, denoted by $\mathrm{Gamma}(x \mid \alpha, \beta)$, is another common distribution used in Bayesian inference. It is used for modeling waiting times, such as survival times. Special cases of the Gamma distribution are the well-known Exponential and Chi-Square distributions.

In Bayesian inference, the Gamma distribution is used as a conjugate prior for the inverse of the variance of a one-dimensional normal distribution, or for parameters such as the rate ($\lambda$) of an exponential or Poisson distribution.

The mathematical form of a Gamma distribution is given by:

$$\mathrm{Gamma}(x \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}$$

Here, $\alpha$ and $\beta$ are the shape and rate parameters, respectively (both take values greater than zero). There is also a form in terms of the scale parameter $\theta = 1/\beta$, which is common in econometrics. Another related distribution is the Inverse-Gamma distribution, which is the distribution of the reciprocal of a variable that is distributed according to the Gamma distribution. It is mainly used in Bayesian inference as the conjugate prior distribution for the variance of a one-dimensional normal distribution.
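The special cases mentioned above follow directly from the shape and rate form of the density. The Python sketch below assumes that form; the function name `gamma_pdf` and the evaluation points are hypothetical. With shape 1 the density is the Exponential distribution, and with shape k/2 and rate 1/2 it is the Chi-Square distribution with k degrees of freedom.

```python
import math

def gamma_pdf(x, shape, rate):
    """Gamma density for x > 0 in the shape/rate parameterization."""
    log_density = (shape * math.log(rate) - math.lgamma(shape)
                   + (shape - 1) * math.log(x) - rate * x)
    return math.exp(log_density)

# Exponential special case: shape = 1 gives rate * exp(-rate * x).
rate, x = 2.0, 1.5
print(gamma_pdf(x, 1.0, rate), rate * math.exp(-rate * x))

# Chi-Square special case with k = 4: shape = 2, rate = 1/2 gives x * exp(-x / 2) / 4.
x = 3.0
print(gamma_pdf(x, 2.0, 0.5), x * math.exp(-x / 2) / 4)
```

Passing `rate = 1 / scale` gives the scale form of the same density.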

Dirichlet distribution

The Dirichlet distribution is a multivariate analogue of the Beta distribution. It is commonly used in Bayesian inference as the conjugate prior distribution for multinomial distribution and categorical distribution. The main reason for this is that it is easy to implement inference techniques, such as Gibbs sampling, on the Dirichlet-multinomial distribution.

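The Dirichlet distribution of order $K \geq 2$ is defined over an open $(K - 1)$ dimensional simplex as follows:

$$\mathrm{Dir}(x_1, \ldots, x_K \mid \alpha_1, \ldots, \alpha_K) = \frac{\Gamma\left(\sum_{i=1}^{K} \alpha_i\right)}{\prod_{i=1}^{K} \Gamma(\alpha_i)} \prod_{i=1}^{K} x_i^{\alpha_i - 1}$$

Here, $x_i > 0$, $\sum_{i=1}^{K} x_i = 1$, and $\alpha_i > 0$.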
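The relationship to the Beta distribution can be checked numerically: for order 2, the Dirichlet density of $(x, 1 - x)$ coincides with the Beta density of $x$. This is a minimal Python sketch assuming the standard Dirichlet density; the function name `dirichlet_pdf` and the parameter values are hypothetical.

```python
import math

def dirichlet_pdf(x, alpha):
    """Dirichlet density at a point x on the open simplex, for parameters alpha > 0."""
    if len(x) != len(alpha) or abs(sum(x) - 1.0) > 1e-9 or min(x) <= 0:
        raise ValueError("x must be positive, sum to 1, and match alpha in length")
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return math.exp(log_norm + sum((a - 1) * math.log(xi) for xi, a in zip(x, alpha)))

# Order 2 reduces to Beta(x | 2, 5) = 30 * x * (1 - x)**4, since B(2, 5) = 1/30.
print(dirichlet_pdf([0.3, 0.7], [2.0, 5.0]), 30 * 0.3 * 0.7 ** 4)

# With all parameters equal to 1, the density is constant over the simplex.
print(dirichlet_pdf([0.2, 0.3, 0.5], [1.0, 1.0, 1.0]))
```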

Wishart distribution

The Wishart distribution is a multivariate generalization of the Gamma distribution. It is defined over symmetric, non-negative definite matrix-valued random variables. In Bayesian inference, it is used as the conjugate prior for the inverse of the covariance matrix, $\Sigma^{-1}$ (the precision matrix), of the normal distribution. When we discussed the Gamma distribution, we said that it is used as a conjugate prior for the inverse of the variance of the one-dimensional normal distribution.

The mathematical definition of the Wishart distribution is as follows:

$$W(X \mid V, n) = \frac{|X|^{(n - p - 1)/2} \exp\left(-\frac{1}{2} \mathrm{tr}(V^{-1} X)\right)}{2^{np/2} \, |V|^{n/2} \, \Gamma_p(n/2)}$$

Here, $|X|$ denotes the determinant of the matrix $X$ of dimension $p \times p$ and $n \geq p$ is the degrees of freedom; $V$ is the $p \times p$ scale matrix and $\Gamma_p$ is the multivariate Gamma function.

A special case of the Wishart distribution, when $p = V = 1$, corresponds to the well-known Chi-Square distribution function with $n$ degrees of freedom.

Wikipedia gives a list of more than 100 useful distributions that are commonly used by statisticians (reference 1 in the Reference section of this chapter). Interested readers should refer to this article.

Exercises

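  1. By using the definition of conditional probability, show that any multivariate joint distribution of N random variables $\{z_1, z_2, \ldots, z_N\}$ has the following trivial factorization:
    $$P(z_1, z_2, \ldots, z_N) = P(z_1 \mid z_2, \ldots, z_N) \, P(z_2 \mid z_3, \ldots, z_N) \cdots P(z_{N-1} \mid z_N) \, P(z_N)$$
  2. The bivariate normal distribution is given by:
    $$\mathcal{N}(x, y \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{2\pi |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$$

    Here:

    $$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}, \quad \boldsymbol{\mu} = \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \sigma_x^2 & \rho \sigma_x \sigma_y \\ \rho \sigma_x \sigma_y & \sigma_y^2 \end{pmatrix}$$

    By using the definition of conditional probability, show that the conditional distribution $P(x \mid y)$ can be written as a normal distribution with mean $\bar{\mu}$ and variance $\bar{\sigma}^2$, where $\bar{\mu} = \mu_x + \rho \frac{\sigma_x}{\sigma_y} (y - \mu_y)$ and $\bar{\sigma}^2 = (1 - \rho^2) \sigma_x^2$.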

  3. By using explicit integration of the expression in exercise 2, show that the marginalization of bivariate normal distribution will result in univariate normal distribution.
  4. In the following table, a dataset containing the measurements of petal and sepal sizes of 15 different Iris flowers is shown (taken from the Iris dataset, UCI machine learning dataset repository). All measurements are in cm:

    Sepal Length  Sepal Width  Petal Length  Petal Width  Class of Flower
    5.1           3.5          1.4           0.2          Iris-setosa
    4.9           3            1.4           0.2          Iris-setosa
    4.7           3.2          1.3           0.2          Iris-setosa
    4.6           3.1          1.5           0.2          Iris-setosa
    5             3.6          1.4           0.2          Iris-setosa
    7             3.2          4.7           1.4          Iris-versicolor
    6.4           3.2          4.5           1.5          Iris-versicolor
    6.9           3.1          4.9           1.5          Iris-versicolor
    5.5           2.3          4             1.3          Iris-versicolor
    6.5           2.8          4.6           1.5          Iris-versicolor
    6.3           3.3          6             2.5          Iris-virginica
    5.8           2.7          5.1           1.9          Iris-virginica
    7.1           3            5.9           2.1          Iris-virginica
    6.3           2.9          5.6           1.8          Iris-virginica
    6.5           3            5.8           2.2          Iris-virginica

    Answer the following questions:

    1. What is the probability of finding flowers with a sepal length more than 5 cm and a sepal width less than 3 cm?
    2. What is the probability of finding flowers with a petal length less than 1.5 cm, given that the petal width is equal to 0.2 cm?
    3. What is the probability of finding flowers with a sepal length less than 6 cm and a petal width less than 1.5 cm, given that the class of the flower is Iris-versicolor?

References

  1. http://en.wikipedia.org/wiki/List_of_probability_distributions
  2. Feller W. An Introduction to Probability Theory and Its Applications. Vol. 1. Wiley Series in Probability and Mathematical Statistics. 1968. ISBN-10: 0471257087
  3. Jaynes E.T. Probability Theory: The Logic of Science. Cambridge University Press. 2003. ISBN-10: 0521592712
  4. Radziwill N.M. Statistics (The Easier Way) with R: an informal text on applied statistics. Lapis Lucera. 2015. ISBN-10: 0692339426

Summary

To summarize, in this chapter we discussed elements of probability theory, particularly those aspects required for learning Bayesian inference. Due to lack of space, we have not covered many elementary aspects of this subject. There are some excellent books on this subject, for example, books by William Feller (reference 2 in the References section of this chapter), E. T. Jaynes (reference 3 in the References section of this chapter), and N. M. Radziwill (reference 4 in the References section of this chapter). Readers are encouraged to read these to get a more in-depth understanding of probability theory and how it can be applied in real-life situations.

In the next chapter, we will introduce the R programming language, which is the most popular open source framework for data analysis, and for Bayesian inference in particular.


Description

Bayesian Inference provides a unified framework to deal with all sorts of uncertainties when learning patterns from data using machine learning models and using them to predict future observations. However, learning and implementing Bayesian models is not easy for data science practitioners due to the level of mathematical treatment involved. Also, applying Bayesian methods to real-world problems requires high computational resources. With the recent advances in computation and several open source packages available in R, Bayesian modeling has become more feasible to use for practical applications today. Therefore, it would be advantageous for all data scientists and engineers to understand Bayesian methods and apply them in their projects to achieve better results.

Learning Bayesian Models with R starts by giving you comprehensive coverage of the Bayesian Machine Learning models and the R packages that implement them. It begins with an introduction to the fundamentals of probability theory and R programming for those who are new to the subject. Then the book covers some of the important machine learning methods, both supervised and unsupervised learning, implemented using Bayesian Inference and R. Every chapter begins with a theoretical description of the method explained in a very simple manner. Then, relevant R packages are discussed and some illustrations using data sets from the UCI Machine Learning repository are given. Each chapter ends with some simple exercises for you to get hands-on experience of the concepts and R packages discussed in the chapter. The last chapters are devoted to the latest developments in the field, specifically Deep Learning, which uses a class of Neural Network models that are currently at the frontier of Artificial Intelligence. The book concludes with the application of Bayesian methods on Big Data using the Hadoop and Spark frameworks.

Who is this book for?

This book is for statisticians, analysts, and data scientists who want to build a Bayes-based system with R and implement it in their day-to-day models and projects. It is mainly intended for Data Scientists and Software Engineers who are involved in the development of Advanced Analytics applications. To understand this book, it would be useful if you have basic knowledge of probability theory and analytics and some familiarity with the programming language R.

What you will learn

  • Set up the R environment
  • Create a classification model to predict and explore discrete variables
  • Get acquainted with Probability Theory to analyze random events
  • Build Linear Regression models
  • Use Bayesian networks to infer the probability distribution of decision variables in a problem
  • Model a problem using Bayesian Linear Regression approach with the R package BLR
  • Use Bayesian Logistic Regression model to classify numerical data
  • Perform Bayesian Inference on massively large data sets using the MapReduce programs in R and Cloud computing

Product Details

Publication date: Oct 28, 2015
Length: 168 pages
Edition: 1st
Language: English
ISBN-13: 9781783987603

Table of Contents

10 Chapters
1. Introducing the Probability Theory
2. The R Environment
3. Introducing Bayesian Inference
4. Machine Learning Using Bayesian Inference
5. Bayesian Regression Models
6. Bayesian Classification Models
7. Bayesian Models for Unsupervised Learning
8. Bayesian Neural Networks
9. Bayesian Modeling at Big Data Scale
Index

Customer reviews

Rating: 3.4 out of 5 (7 ratings)
5 star 42.9%
4 star 0%
3 star 28.6%
2 star 14.3%
1 star 14.3%
Hugo Jan 11, 2016
Rating: 5 out of 5
This book is good: it reads clearly and the information is structured in good steps, with exercises and references; these last two are very useful when you want more detailed information. Statistics isn't easy, at least for me, but I could learn the advantages of Bayesian inference.

I think that the first chapters are an essential introduction to the subject and to the tools to work with, but what you really want comes in modules: first you get an understanding of the use and capabilities of the principles of Bayesian inference, after that you get a notion of Bayesian methods and R, then you start to use both in machine learning. Machine Learning has many uses, so I think that the applicability of the book tends to infinity. I really liked that the author gives the basics of Bayesian neural networks in chapter 8, talking about deep belief networks and their advantages, and, like in all the other subjects, he gives good references to go deep and learn for sure. You will understand how structured the book is when you reach the last chapter and see how much you've learned and the complexity that your projects can achieve; all the chapters are like the steps of a stair.

My experience reading this was good because I feel that I've learned, and the exercises made me work with the material. Bayesian inference is different from classic statistics, and you can use it to solve your project needs. I certainly recommend this book; it is hard to find such information as well explained as in this book.
Amazon verified review
Duncan W. Robinson Nov 06, 2015
Rating: 5 out of 5
Learning Bayesian Models with R is a great book for those who want a hands-on approach to Bayesian data analysis and modeling using R software. It’s really not easy to find books that provide a good introduction to Bayesian theory and methodology, tell you how this information can be used (what you can do with knowledge of this methodology), and give useful examples with R code. My favorite sections were chapters 5 through 8; these detail Bayesian regression models, Bayesian classification models, and, my favorite, Bayesian neural networks (I mean, how cool is that?). For those who program in Python, I also recommend Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference.
Amazon verified review
Perry Nally Jan 02, 2016
Rating: 5 out of 5
A great book about machine learning and big data processing using Bayesian statistical algorithms. I have to say that this book is not for the faint of heart. But if you want to learn something very useful in the decades to come, then this is for you, faint-hearted included. It is an intensely mathematical read, but there's no other way to portray such elegance. This book presents the equations needed without leaving anything out. There are study sections at the end of each chapter to help you verify your understanding, which is a nice addition. I do recommend thoroughly understanding the first two chapters before moving on to the rest of the book, as they contain critical statistical logic that is needed to understand the mathematical models used in the rest of the book. It's a great book and can be used as a resource for artificial intelligence and big data. It's also well organized, with short, clear details.
Amazon Verified review Amazon
Dimitri Shvorob Mar 05, 2016
Rating: 3 out of 5
Let me be frank: I don't like Packt, a publisher that saves money on editing and graphic design, and just keeps churning out un-edited, ugly books by authors who could not, or did not, go with a proper publisher. They give free e-books to reviewers, and many feel obliged to return the favor by posting super-positive, but detail-free "reviews", which don't mention any alternatives, and sometimes name-check a book's key terms, but in an odd way that suggests that they don't really know what those mean. Just look at the reviews, and ratings, of this book's fan Dipanjan Sarkar, for example. I have seen many Packt books, and many Dipanjans, and I am annoyed. Anyway, this is not a five-star book. It is not a typical Packt book, in that Packt publishes IT books, and this is a formula-ridden statistics book that would be more at home in the catalog of an academic publisher like Springer or CRC-Hall. It starts with a concise if dry survey of Bayesian basics, and then surveys several Bayesian methods, implemented by specific R packages - "arm", "BayesLogit", "lda", "brnn", "bgmm", "darch". I see good things about the first part; the second one, on the other hand, came across as not very clearly written and too superficial to be useful: for example, I simply failed to understand the author's explanation of the Bayesian logit, and that should not have been complicated. The decision to go with specific R packages is understandable, but the failure to even mention BUGS, JAGS or STAN - the popular, general tools - is not. I don't understand who the book is for. The people who can handle the integrals, so to speak, will find the relevant R packages on their own. The less technical readers, on the other hand, will be put off by the academic style. My suggestion to the latter is "Doing Bayesian Data Analysis" by Kruschke. There is also a good, accessible book which uses Python rather than R, by Davidson-Pilon.
UPD. With the benefit of a little more life experience, I would say: don't spend your time on *any* R book. Python is the way to go.
Amazon Verified review Amazon
Vincent Jun 06, 2016
Rating: 3 out of 5
The book provides a quick review of all the main things you need to know when running Bayesian analyses in R. It will not make you a Bayesian wizard, but it could serve as a quick introduction to Bayesian analyses in R. A point of criticism is that in some places I felt that the author could have provided a bit more guidance on interpretation. For example, chapter 5 explains how to fit a Bayesian model and how to simulate the posterior distribution, but does not devote a single line to explaining how a user should interpret and use that simulation compared with the model coefficients. Further, chapter 5 states that smaller confidence intervals in the Bayesian model are a major benefit, but when I compare the actual predictions with reference values, the Bayesian model actually performs marginally worse than ordinary least squares regression. The author should really make more of an effort to explain why smaller confidence intervals are worth the reduction in actual model quality. Code examples are simply a log of the author's command line entries, including odd repetitions. Further, the code has some poor programming habits, e.g. using the attach(data) function is not meaningful if you already pass the data argument to the model.
Amazon Verified review Amazon
Get free access to the Packt library, with over 7,500 books and video courses, for 7 days!
Start Free Trial

FAQs

What is included in a Packt subscription?

A subscription provides you with full access to view all Packt and licensed content online, including exclusive access to Early Access titles. Depending on the tier chosen, you can also earn credits and discounts to use for owning content.

How can I cancel my subscription?

To cancel your subscription with us, simply go to the account page, found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription. From there you will see the ‘cancel subscription’ button in the grey box that contains your subscription information.

What are credits?

Credits can be earned by reading 40 sections of any title within the payment cycle, which is a month starting from the day of subscription payment. You also earn a credit every month if you subscribe to our annual or 18-month plans. Credits can be used to buy books DRM-free, the same way that you would pay for a book. Your credits can be found on the subscription homepage, subscription.packtpub.com, by clicking on the ‘my library’ dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled?

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title?

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles?

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date?

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready?

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access?

Yes, all Early Access content is fully available through your subscription. You will need a paid-for or active trial subscription in order to access all titles.

How is Early Access delivered?

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content?

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access?

Keeping up to date with the latest technology is difficult; new versions, new frameworks, new techniques. This feature gives you a head-start to our content, as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready. We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls into place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.