Predicting recommendations for movies and jokes
In this chapter, we will build recommender systems using two different data sets. To do this, we will use the recommenderlab package, which provides not only the algorithms that produce the recommendations, but also the data structures that store sparse rating matrices efficiently. The first data set we will use contains anonymous user ratings of jokes from the Jester Online Joke Recommender System.
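To make the idea of a sparse rating matrix concrete, the following is a minimal, language-neutral sketch in Python. It is not how recommenderlab stores ratings internally. The user and item labels, the rating values, and the helper names build_sparse and density are all hypothetical, chosen only for illustration. The point is that only observed ratings are stored, so unrated items take up no space, and that per-user rating counts are easy to read off such a structure.

```python
# Hypothetical toy data: (user, item, rating) triplets on a -10 to +10 scale.
# Only observed ratings appear; unrated user/item pairs are simply absent.
ratings = [
    ("u1", "j1", 4.5),
    ("u1", "j3", -7.25),
    ("u2", "j2", 9.0),
    ("u3", "j1", -1.5),
    ("u3", "j2", 0.75),
    ("u3", "j3", 3.0),
]


def build_sparse(triplets):
    """Map each user to a dict of {item: rating}; missing ratings use no storage."""
    matrix = {}
    for user, item, rating in triplets:
        matrix.setdefault(user, {})[item] = rating
    return matrix


def density(matrix, n_items):
    """Fraction of the full user-by-item grid that actually holds a rating."""
    stored = sum(len(row) for row in matrix.values())
    return stored / (len(matrix) * n_items)


sparse = build_sparse(ratings)

# Distinct items seen anywhere in the data.
items = {item for row in sparse.values() for item in row}

# Number of ratings each user made, and the smallest such count.
ratings_per_user = {user: len(row) for user, row in sparse.items()}
min_ratings = min(ratings_per_user.values())

print(ratings_per_user)
print(min_ratings)
print(density(sparse, len(items)))
```

In a real rating matrix with tens of thousands of users, the density is usually far lower than in this toy example, which is why a sparse representation matters.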
The joke ratings fall on a continuous scale (-10 to +10). A number of data sets collected from the Jester system can be found at http://eigentaste.berkeley.edu/dataset/. We will use the data set labeled on the website as Dataset 2+. This data set contains ratings made by 50,692 users on 150 jokes. As is typical with a real-world application, the rating matrix is very sparse in that each user rated only a fraction of all the jokes; the minimum number of ratings made by a user is 8. We will refer to this data set as the jester...