Characteristics of Transaction Data
The data used in market basket analysis is transaction data, or any data that resembles it. In its most basic form, transaction data has a transaction identifier, such as an invoice or transaction number, and a list of products associated with that identifier. These two elements are all that is needed to perform market basket analysis. However, transaction data rarely (arguably never) comes in this basic form. It typically also includes pricing information, dates and times, and customer identifiers, among many other things. Here is how each product is mapped to multiple invoices:
Figure 8.10: Each available product is going to map back to multiple invoice numbers
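To make the two base elements concrete, the following is a minimal sketch in plain Python. The rows, invoice numbers, product names, prices, and customer identifiers are all made up for illustration and do not come from the dataset used in this chapter. The sketch reduces each row to the identifier and product, then builds both views of the data: the products in each invoice, and the invoices each product appears in.

```python
from collections import defaultdict

# Hypothetical raw rows: (invoice_no, product, unit_price, customer_id).
# Real transaction data usually carries even more columns than this.
raw_rows = [
    ("INV-001", "bread", 2.50, "C17"),
    ("INV-001", "milk", 1.20, "C17"),
    ("INV-002", "bread", 2.50, "C42"),
    ("INV-002", "eggs", 3.00, "C42"),
    ("INV-003", "milk", 1.20, "C08"),
]

# invoice -> set of products (the basic form needed for the analysis)
baskets = defaultdict(set)
# product -> set of invoices (the mapping described by the figure)
product_to_invoices = defaultdict(set)

for invoice_no, product, _unit_price, _customer_id in raw_rows:
    # Only the identifier and the product are kept; the rest is ignored.
    baskets[invoice_no].add(product)
    product_to_invoices[product].add(invoice_no)

for invoice_no in sorted(baskets):
    print(invoice_no, sorted(baskets[invoice_no]))

for product in sorted(product_to_invoices):
    print(product, sorted(product_to_invoices[product]))
```

Sets are used here so that a product repeated within one invoice is counted once, which is an assumption about how duplicates should be treated rather than a requirement of the analysis.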
Due to the complexity of transaction data, data cleaning is crucial. The goal of data cleaning in the context of market basket analysis is to filter out all...