Designing architecture for genomics
In this section, we describe a sample reference architecture for transferring, storing, processing, and gaining insights from genomics datasets on the AWS cloud in a secure and cost-effective manner. Figure 12.1 shows a sample genomics data processing workflow:
Figure 12.1 – Genomics data processing workflow
Figure 12.1 shows the following workflow:
- A scientist or lab technician collects a biological sample containing genomic material (for example, skin cells), prepares it in a lab, and then loads it into a sequencer.
- The sequencer then generates sequence data, often in the form of short DNA fragments. These fragments are usually called reads because the sequencer is reading the DNA.
- The DNA sequence is stored in an on-premises data storage system.
- The AWS DataSync service then transfers the genomic data securely to the cloud; for further details, refer to Chapter 2, Data Management and Transfer.
- The raw genomic data is then stored...