Summary
In this chapter, we have seen the implementation of rules using Snorkel labeling functions for predicting the income range and labeling functions using the Compose library to predict the total amount spent by a customer during a given period. We have learned how semi-supervised learning can be used to generate pseudo-labels and data augmentation. We also learned how K-means clustering can be used to cluster the income features and then predict the income for each cluster based on business knowledge.
In the next chapter, we are going to learn how we can label data for regression using the Snorkel Python library, semi-supervised learning, and K-means clustering. Let us explore that in the next chapter.