Using logistic regression to derive odds ratios for categorical variables
OK, now, let’s make a logistic regression model for research question number 2: Are ST depression level, age, and cholesterol associated with the presence of CAD?
For this model, we will use CAD status, coded as 1 and 0 (yes/no), as the dependent variable. We will set ST depression, age, and serum cholesterol as the independent variables:
Figure 9.10 – CAD status prediction model with added serum cholesterol
This time, we will use binary logistic regression, because the dependent variable is a binary categorical variable coded as 1 and 0. Let’s see the code for this:
import pandas as pd
import statsmodels.api as sm
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# Define the formula for logistic regression
formula = 'cad ~ oldpeak + age + chol'

# Create the logistic regression model
logit_model = sm.Logit.from_formula(formula...
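Because a logistic regression coefficient is expressed on the log-odds scale, an odds ratio is obtained by exponentiating it. The following is a minimal sketch of that conversion, assuming you already have a coefficient and its standard error. The helper name `odds_ratio` and the example numbers are hypothetical and are not results from the model above; the interval is the usual approximate Wald interval.

```python
import math


def odds_ratio(coef, std_err, z=1.96):
    """Convert a logit coefficient (log-odds) into an odds ratio.

    Returns the point estimate and an approximate Wald confidence
    interval (95% when z = 1.96), all on the odds-ratio scale.
    """
    point = math.exp(coef)
    lower = math.exp(coef - z * std_err)
    upper = math.exp(coef + z * std_err)
    return point, lower, upper


# Hypothetical coefficient and standard error, for illustration only
example_coef = 0.5
example_se = 0.1

or_point, or_lower, or_upper = odds_ratio(example_coef, example_se)
print(f"OR = {or_point:.2f} (95% CI {or_lower:.2f} to {or_upper:.2f})")
```

An odds ratio above 1 indicates that a one-unit increase in the predictor is associated with higher odds of the outcome, and an odds ratio below 1 with lower odds. With a fitted statsmodels result, the coefficients and standard errors are typically read from its `params` and `bse` attributes, so the same exponentiation can be applied to each predictor.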