Introduction
All the datasets used in the preceding chapters have not had much or any work done to change their structure. We immediately began processing the datasets in their original shape. Many datasets in the wild will need a significant amount of restructuring before commencing a more detailed analysis. In some cases, an entire project might only concern itself with formatting the data in such a way that it can be easily processed by someone else.
There are many terms that are used to describe the process of data restructuring, with tidy data being the most common to data scientists. Tidy data is a term coined by Hadley Wickham to describe a form of data that makes analysis easy to do. This chapter will cover many ideas formulated by Hadley and how to accomplish them with pandas. To learn a great deal more about tidy data, read Hadley's paper (http://vita.had.co.nz/papers/tidy-data.pdf).
The following is an example of untidy data: