A genetic programming example
CNVkit is also available, but it is a command-line tool and not easy to use. PyCogent, which was developed by researchers at NCBI, part of the National Institutes of Health (NIH), is another useful tool, but it is not easy to use either. We will use the Bio package from Biopython (https://github.com/biopython/biopython/tree/master/Bio) and libraries from Python programming for biology.
In general, every experiment, research project, or study in bioinformatics has the sequence as its key object. As a mathematician, I picture a sequence as a string with certain patterns (such as ATAGCATATGCT). To begin with, here is a simple example that shows a sequence, its GC ratio, and its codons:
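from Bio.Seq import Seq
from Bio.Alphabet import IUPAC  # Bio.Alphabet was removed in Biopython 1.78; needs an older release
from Bio.SeqUtils import GC  # newer Biopython releases provide gc_fraction instead

def DNACodons(seq):
    # Ignore trailing bases that do not form a complete codon
    end = len(seq) - (len(seq) % 3)
    # Split the sequence into consecutive three-base codons
    codons = [seq[i:i+3] for i in range(0, end, 3)]
    return codons

# Call DNACodons(my_seq) only after my_seq has been defined
my_seq = Seq('GGTCGATGGGCCTAGCAGCATATCTGAGC...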
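The two ideas used here, the GC ratio and the split into codons, can also be sketched without Biopython, treating the sequence as an ordinary Python string. This is only a minimal illustration: the helper names gc_ratio and split_codons are hypothetical and not part of any library, and the input is the short sample string ATAGCATATGCT mentioned in the text.

```python
def gc_ratio(seq: str) -> float:
    """Return the percentage of G and C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    gc_count = seq.count("G") + seq.count("C")
    return 100.0 * gc_count / len(seq)


def split_codons(seq: str) -> list:
    """Split seq into consecutive three-base codons, dropping 1-2 leftover bases."""
    usable = len(seq) - len(seq) % 3  # length rounded down to a multiple of 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


example = "ATAGCATATGCT"  # sample string from the text (hypothetical input)
print(round(gc_ratio(example), 2))  # 33.33
print(split_codons(example))        # ['ATA', 'GCA', 'TAT', 'GCT']
```

Rounding the length down to a multiple of three means an incomplete trailing codon is silently ignored, which matches the intent of the DNACodons function.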