A BAM file typically contains all the information needed to interpret sequencing data. For many studies, the ability to extract and isolate specific alignments by various criteria is essential. The SAM flag system, though quirky and unwieldy, can be used to accomplish many of the required steps.
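To make the flag system less opaque, here is a minimal sketch that decodes a FLAG value into the bits it is made of. The bit values are those defined in the SAM specification; the short labels and the helper name `decode_flag` are illustrative choices, not part of the recipe.

```python
# Bit values as defined in the SAM specification.
# The short labels are illustrative names chosen for this sketch.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


# 99 = 1 + 2 + 32 + 64
print(decode_flag(99))  # ['paired', 'proper_pair', 'mate_reverse', 'read1']
```

A FLAG is simply the sum of the bits that apply to an alignment, so any selection criterion reduces to testing which bits are set.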
This recipe provides the code needed to investigate alignment files produced by aligning viral sequencing data from the 2014 Ebola outbreak against the Mayinga strain observed in 1976.
The recipe produces a statistics report that summarizes the various types of alignments. It performs the following steps:
Downloads the 1976 Ebola (Mayinga) reference genome
Downloads sequencing data for the 2014 outbreak
Generates sequence alignments of the 2014 outbreak relative to the 1976 outbreak
Investigates the resulting data and reports the number of alignments by various criteria
Demonstrates how more complex queries could be formulated
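The counting step can be sketched in plain Python, assuming the alignments are available as SAM-formatted text lines. The function `count_alignments` and the five toy records are hypothetical and are not data from the 2014 outbreak. The `require` and `exclude` masks are modeled on the `-f` and `-F` options of `samtools view`: the first keeps only alignments with all of the given bits set, the second drops alignments with any of them set.

```python
def count_alignments(sam_lines, require=0, exclude=0):
    """Count SAM records that have every bit in `require` set
    and no bit in `exclude` set."""
    count = 0
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        if (flag & require) == require and (flag & exclude) == 0:
            count += 1
    return count


# Made-up records for illustration only.
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t99\tref\t100\t60\t5M\t=\t300\t205\tACGTA\tIIIII",
    "r1\t147\tref\t300\t60\t5M\t=\t100\t-205\tACGTA\tIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
    "r3\t16\tref\t500\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r3\t2064\tref\t900\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
]

print("mapped:", count_alignments(toy_sam, exclude=0x4))            # 4
print("unmapped:", count_alignments(toy_sam, require=0x4))          # 1
print("properly paired:", count_alignments(toy_sam, require=0x2))   # 2
print("reverse strand:", count_alignments(toy_sam, require=0x10, exclude=0x4))  # 3
# A more complex query: primary mapped alignments only, i.e. not
# unmapped (0x4), not secondary (0x100), not supplementary (0x800).
print("primary mapped:", count_alignments(toy_sam, exclude=0x4 | 0x100 | 0x800))  # 3
```

Combining several bits into one mask is what allows a more complex query to be written as a single filter rather than a chain of separate ones.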
A detailed presentation that explains the steps and rationale for this recipe can be found at: