Summary
In this chapter, we covered MLflow and its integration with the feature management data layer of our reference architecture. We leveraged the features of the MLflow Projects module to structure our data pipeline.
We introduced the data and feature management layer, explained why feature generation is needed, and covered the concepts of data quality, validation, and data preparation.
We applied the different stages of building a data pipeline to our own project, and then formalized data acquisition and quality checks. In the last section, we introduced the concept of a feature store and showed how to create and use one.
In the following chapters, which make up the next section of the book, we will apply the data pipeline and features to the process of training models and deploying them in production.