Doing Automated EDA using pandas profiling
pandas profiling is a popular Automated EDA library that generates EDA reports from a dataset stored in a pandas dataframe. With a single line of code, the library can generate a detailed report covering critical information such as summary statistics, the distribution of variables, correlations and interactions between variables, and missing values. Because it requires minimal effort from its users, the library is useful for generating insights from large datasets quickly. The output is presented as an interactive HTML report, which can easily be customized.
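The single-line usage described above can be sketched as follows. This is a minimal illustration, not the library's prescribed workflow: the dataframe contents, the helper name build_profile, and the output filename are made up for the example, and the import assumes the package is installed under either its current name (ydata-profiling) or its older name (pandas-profiling). The plain pandas lines compute a few of the figures such a report summarizes, so they can be cross-checked by hand.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame(
    {
        "age": [34, 45, None, 29, 34],
        "city": ["Lagos", "Accra", "Lagos", None, "Lagos"],
        "income": [52000.0, 61000.0, 58000.0, 47000.0, 52000.0],
    }
)


def build_profile(data, title="Automated EDA report"):
    """Return a profiling report object for the given dataframe.

    The import is done lazily so the rest of this sketch runs even when
    the profiling package is not installed (an assumption of this example).
    """
    try:
        from ydata_profiling import ProfileReport  # current package name
    except ImportError:
        from pandas_profiling import ProfileReport  # older package name
    return ProfileReport(data, title=title)


# A few figures computed directly with pandas for comparison.
n_rows, n_columns = df.shape
n_missing = int(df.isna().sum().sum())      # total missing cells
n_duplicates = int(df.duplicated().sum())   # rows repeating an earlier row

print(f"rows={n_rows}, columns={n_columns}, "
      f"missing={n_missing}, duplicates={n_duplicates}")
```

With the profiling package installed, build_profile(df).to_file("eda_report.html") writes the interactive HTML report to disk; the filename here is only a placeholder.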
The Automated EDA report generated by pandas profiling contains the following sections:
- Overview: This section provides a general summary of the dataset. It includes the number of observations, the number of variables, missing values, duplicate rows, and more.
- Variables: This section provides information about the variables in the dataset. It includes summary statistics ...