Using Python to compare ratings
In the previous examples, we used R to work with data frames built from JSON files that had been converted to CSV. The Yelp business ratings file is much smaller, so we can process it directly in Python and still produce similar results.
In this example, we gather cuisines from the Yelp file based on whether the business category includes restaurants. We accumulate the ratings for all cuisines and then produce averages for each.
We read the JSON file in as separate lines and convert each line into a Python object:
Note
We convert each line to Unicode with the errors=ignore option. This is due to many erroneous characters present in the data file.
import json
#filein = 'c:/Users/Dan/business.json'
filein = 'c:/Users/Dan/yelp_academic_dataset_business.json'
lines = list(open(filein))
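The conversion step described above can be sketched as follows. This is a minimal illustration, not the original listing: the helper name parse_lines and the two sample records are hypothetical, and the field names (name, stars, categories) are assumed from the Yelp business file layout. The sample uses in-memory byte strings so that it runs without the data file.

```python
import json

def parse_lines(raw_lines):
    """Decode each raw line, dropping undecodable bytes, then parse it as JSON."""
    businesses = []
    for raw in raw_lines:
        # errors='ignore' silently discards bytes that are not valid UTF-8
        text = raw.decode('utf-8', errors='ignore')
        businesses.append(json.loads(text))
    return businesses

# Hypothetical records standing in for lines of the Yelp file.
# The first one contains a stray byte (\xff) that is not valid UTF-8.
sample = [
    b'{"name": "Caf\xff Roma", "stars": 4.5, "categories": ["Italian", "Restaurants"]}',
    b'{"name": "Quick Lube", "stars": 3.0, "categories": ["Automotive"]}',
]

businesses = parse_lines(sample)
print(businesses[0]['name'])  # the invalid byte has been dropped
```

Ignoring errors loses the offending characters rather than repairing them, which is acceptable here because only the categories and ratings are used.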
We use a dictionary to hold the ratings for each cuisine. The key of the dictionary is the name of the cuisine. The value of the dictionary is a list of ratings for that cuisine:
ratings...
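To make the dictionary-of-lists idea concrete, here is a minimal sketch of accumulating ratings per cuisine and then averaging them. It is an illustration under assumptions rather than the original code: the function name average_ratings and the sample records are hypothetical, each business is assumed to carry a stars value and a categories list, and every category other than Restaurants on a restaurant record is treated as a cuisine.

```python
from collections import defaultdict

def average_ratings(businesses):
    """Return {cuisine: average stars} for businesses categorized as restaurants."""
    ratings = defaultdict(list)  # cuisine name -> list of star ratings
    for business in businesses:
        categories = business.get('categories') or []
        if 'Restaurants' not in categories:
            continue  # skip non-restaurant businesses
        for category in categories:
            if category != 'Restaurants':
                ratings[category].append(business['stars'])
    # Average each cuisine's accumulated ratings
    return {cuisine: sum(stars) / len(stars) for cuisine, stars in ratings.items()}

# Hypothetical records in the assumed shape of parsed Yelp business entries.
sample = [
    {'stars': 4.5, 'categories': ['Italian', 'Restaurants']},
    {'stars': 3.5, 'categories': ['Restaurants', 'Italian']},
    {'stars': 3.0, 'categories': ['Mexican', 'Restaurants']},
    {'stars': 5.0, 'categories': ['Automotive']},
]

averages = average_ratings(sample)
for cuisine in sorted(averages):
    print(cuisine, averages[cuisine])
```

Using defaultdict(list) avoids checking whether a cuisine key already exists before appending a rating.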