An overview of statistical modeling
To explore the relationship between data and a set of experimental conditions, we often rely on statistical modeling. One of R's central purposes is to estimate how well your data fit a variety of models, which you can easily refine using several built-in functions and arguments. Although picking the best model to represent your data can be overwhelming, it is important to keep the principle of parsimony in mind: you should only include an explanatory variable in a model if it significantly improves the model's fit. Our ideal model will therefore try to fulfill most of the criteria in this list:
- Contain n-1 parameters instead of n parameters
- Contain k-1 explanatory variables instead of k variables
- Be linear instead of curved
- Not contain interactions between factors
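The criteria above can be illustrated with a minimal sketch that compares a model with an interaction term against a simpler nested model. Everything here is an assumption made for illustration: the data are simulated, the variable names (`x1`, `x2`, `y`, `dat`) are hypothetical, and the 0.05 cutoff is just a common convention, not a rule from this text.

```r
# Simulated (hypothetical) data: y depends on x1 only;
# x2 and the x1:x2 interaction contribute nothing but noise.
set.seed(42)
n   <- 100
x1  <- rnorm(n)
x2  <- rnorm(n)
y   <- 2 + 1.5 * x1 + rnorm(n)
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Full model: both main effects plus their interaction.
full <- lm(y ~ x1 * x2, data = dat)

# Simpler model: the same formula with the interaction removed.
no_interaction <- update(full, . ~ . - x1:x2)

# F-test for the two nested models: does the interaction
# significantly improve the fit?
cmp     <- anova(no_interaction, full)
p_value <- cmp[2, "Pr(>F)"]

# Parsimony: keep the simpler model unless the extra term earns its place.
# The 0.05 threshold is an assumed convention.
chosen <- if (p_value < 0.05) full else no_interaction

print(cmp)
print(formula(chosen))
```

The same comparison can be repeated for each remaining explanatory variable, dropping one term at a time and re-testing, so that the final model contains only terms that significantly improve the fit.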
In other words, we can simplify our model by removing non-significant interaction terms and explanatory variables, and by grouping together factor...