Libraries for biostatistics hypothesis tests in Python
One of the most important ways to gain insights from biological data is to use hypothesis tests to see if the output we are seeing is statistically relevant or not. This principle is called testing for statistical significance. Statistical significance is a term commonly linked to the p-value, which represents the probability of obtaining test results that are as extreme as or more extreme than the observed result. So, the lower the p-value, the more credible the result.
The underlying principles of p-values
We can visualize the probability aspects of p-values on a Gaussian curve as follows:
Figure 5.1 – Normal Gaussian curve with one-tailed alpha
The shaded region in the area under the curve (AUC) represents the probability that the observed results are less extreme than what is shown. This means the blue area is actually an area of uncertainty. If the blue area is less than 5% of the...