Regularization in logistic regression
One of the dangers of machine learning is over-fitting: the algorithm captures not only the signal in the training set, but also the statistical noise that results from the finite size of the training set.
A way to mitigate over-fitting in logistic regression is regularization: when optimizing, we penalize large parameter values by adding a term to the cost function that grows with the magnitude of the parameters. Formally, we re-write the logistic regression cost function (described in Chapter 2, Manipulating Data with Breeze) as:
$$
C_{\mathrm{reg}}(params) = C(params) + \lambda \, \lVert params \rVert^2
$$
where $C(params)$ is the normal logistic regression cost function:
$$
C(params) = -\sum_i \left[ y_i \log \sigma(x_i \cdot params) + (1 - y_i) \log\bigl(1 - \sigma(x_i \cdot params)\bigr) \right]
$$
Here, $params$ is the vector of parameters, $x_i$ is the vector of features for the $i$th training example, $y_i$ is 1 if the $i$th training example is spam and 0 otherwise, and $\sigma$ is the sigmoid function, $\sigma(t) = 1/(1 + e^{-t})$. This is identical to the logistic regression cost function introduced in Chapter 2, Manipulating Data with Breeze, apart from the addition of the regularization term $\lambda \lVert params \rVert^2$: large parameter values now increase the cost, so the optimizer is biased towards small parameters.
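To make the formula concrete, here is a minimal sketch of the regularized cost function using Breeze; the function name `regularizedCost` and its signature are illustrative, not part of the book's code:

```scala
import breeze.linalg.{DenseMatrix, DenseVector}
import breeze.numerics.sigmoid

/* A toy L2-regularized logistic regression cost function.
 * `xs` holds one training example per row; `ys` holds the targets
 * (1.0 for spam, 0.0 otherwise); `lambda` controls the strength
 * of the regularization.
 */
def regularizedCost(
    params: DenseVector[Double],
    xs: DenseMatrix[Double],
    ys: DenseVector[Double],
    lambda: Double
): Double = {
  // P(spam) for every training example: sigmoid(x_i . params)
  val probs = sigmoid(xs * params)
  // The un-regularized cost C(params): the negative log-likelihood
  val cost = (0 until ys.length).map { i =>
    -(ys(i) * math.log(probs(i)) + (1.0 - ys(i)) * math.log(1.0 - probs(i)))
  }.sum
  // Add the penalty term: lambda times the squared norm of the parameters
  cost + lambda * (params dot params)
}
```

Replacing the squared norm with the L1 norm (the sum of the absolute values of the parameters) is another common choice; it tends to drive the parameters of irrelevant features to exactly zero.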