Exploring a dataset with Bio.PopGen
In this recipe, we will perform an initial exploratory analysis of one of our generated datasets: the 10 percent sampling of chromosome 2 without the offspring. We will look for monomorphic loci (in this case, SNPs) across populations and examine minimum allele frequencies and expected heterozygosities.
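To make the three quantities concrete before any external tool is involved, here is a minimal pure-Python sketch. It is not the Genepop or Biopython code path used in this recipe. The helper names and the toy genotypes are hypothetical, and it assumes diploid genotypes stored as allele pairs with None for missing data. Expected heterozygosity is computed as one minus the sum of squared allele frequencies, without any small-sample correction.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for one locus.

    genotypes is a list of (allele1, allele2) tuples; None marks missing data.
    """
    counts = Counter(a for g in genotypes if g is not None for a in g)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {allele: n / total for allele, n in counts.items()}


def is_monomorphic(freqs):
    """A locus is monomorphic when exactly one allele is observed."""
    return len(freqs) == 1


def min_allele_frequency(freqs):
    """Frequency of the rarest observed allele (1.0 for a monomorphic locus)."""
    return min(freqs.values())


def expected_heterozygosity(freqs):
    """1 - sum(p_i^2), with no small-sample correction."""
    return 1.0 - sum(p * p for p in freqs.values())


# Hypothetical toy locus: four genotyped individuals and one missing genotype.
toy_locus = [(1, 1), (1, 2), (2, 2), (1, 1), None]
freqs = allele_frequencies(toy_locus)
print(freqs, is_monomorphic(freqs))
print(min_allele_frequency(freqs), expected_heterozygosity(freqs))
```

In the toy locus, allele 1 appears five times out of eight and allele 2 three times, so the locus is polymorphic and the rarest allele has frequency 3/8.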
Getting ready
You will need to have run the previous two recipes and should have the hapmap10_auto_noofs_2.gp and hapmap10_auto_noofs_2.pops files. We will also use the metadata file downloaded in the first recipe.
For this code to work, you will need to install Genepop from http://kimura.univ-montp2.fr/~rousset/Genepop.htm. We will use the interface provided by Biopython to execute Genepop and parse its output files.
There is a notebook with this recipe, 03_PopGen/Exploratory_Analysis.ipynb, but you will still need to run the previous two notebooks first so that the required files are generated.
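A quick preflight check can confirm that these prerequisites are in place before starting. The sketch below is only an illustration. It assumes that the two input files sit in the current working directory and that the Genepop executable is on your PATH under the name Genepop; adjust both if your setup differs. The function name missing_inputs is hypothetical.

```python
import shutil
from pathlib import Path

# Input files named in this recipe; their location is an assumption.
REQUIRED_FILES = ["hapmap10_auto_noofs_2.gp", "hapmap10_auto_noofs_2.pops"]


def missing_inputs(data_dir=".", executable="Genepop"):
    """Return the names of required files or executables that cannot be found."""
    missing = [
        name for name in REQUIRED_FILES
        if not (Path(data_dir) / name).is_file()
    ]
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(executable) is None:
        missing.append(executable)
    return missing


problems = missing_inputs()
print("Missing:", problems if problems else "nothing, all prerequisites found")
```

If the list is not empty, rerun the earlier recipes or install Genepop before continuing.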
How to do it…
Take...