Performing and visualizing logistic regression in Python
The previous examples in this chapter used linear regression because the association between the variables was linear. In this section, however, we will be analyzing a potentially non-linear association with a binary outcome, so we need a different regression method: binary logistic regression. This type of regression can take different kinds of inputs (continuous or categorical predictors) and models a binary outcome (e.g., a Yes/No variable). The predictors can (but do not need to) be normally distributed. For this reason, it is more appropriate than linear regression for binary outcomes.
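To make the idea concrete, the short sketch below shows how logistic regression turns a linear combination of inputs into a probability between 0 and 1 by means of the logistic (sigmoid) function. The intercept and slope values are hypothetical and chosen only for illustration; they are not estimates from the diabetes dataset.

```python
import math


def logistic(z):
    """Map a linear predictor z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, chosen only for illustration
intercept = -6.0
slope = 0.2

# The probability rises smoothly with the predictor but never leaves (0, 1)
for bmi in (20, 30, 40):
    p = logistic(intercept + slope * bmi)
    print(f"BMI {bmi}: predicted probability = {p:.3f}")
```

A straight line fitted to a 0/1 outcome could predict values below 0 or above 1, whereas the S-shaped logistic curve cannot, which is one reason it suits binary outcomes.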
Using the diabetes dataset, you will learn how to perform this analysis, with the main research question being whether there is an association between BMI and diabetes (CLASS: Y). You will also use controls for the binary comparison (CLASS: N).
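As a rough sketch of what such an analysis involves, the example below recodes a CLASS label into 0/1 and fits a one-predictor logistic regression of CLASS on BMI using Newton-Raphson iterations written in plain Python. The data are synthetic and generated only for illustration (they are not the diabetes dataset), and fit_logistic is a hypothetical helper rather than a library function; in practice a statistics package would normally be used for the fit.

```python
import math
import random


def logistic(z):
    """Map a linear predictor z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, n_iter=25):
    """Fit P(y = 1) = logistic(b0 + b1 * x) by Newton-Raphson (hypothetical helper)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood w.r.t. b0
            g1 += (yi - p) * xi     # gradient of the log-likelihood w.r.t. b1
            h00 += w                # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic data for illustration only (NOT the real diabetes dataset)
random.seed(0)
bmi = [random.gauss(29.0, 5.0) for _ in range(1000)]
labels = ["Y" if random.random() < logistic(-6.0 + 0.2 * b) else "N" for b in bmi]

# Recode the binary outcome: cases (CLASS: Y) -> 1, controls (CLASS: N) -> 0
y = [1 if label == "Y" else 0 for label in labels]

b0, b1 = fit_logistic(bmi, y)
odds_ratio = math.exp(b1)  # multiplicative change in the odds per one-unit increase in BMI
print(f"intercept = {b0:.3f}, slope = {b1:.3f}, odds ratio per BMI unit = {odds_ratio:.3f}")
```

A positive slope (an odds ratio above 1) would indicate that higher BMI is associated with higher odds of being in the diabetes group; whether this holds in the real data is what the analysis sets out to check.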
Later in this section, we will focus on implementing the data visualization methods...