Filtering Sequence Alignment/Map (SAM/BAM) files.
Typically, the BAM file contains all the information necessary to interpret sequencing data. For many studies, the ability to extract and isolate alignments that match specific criteria is essential. The SAM flag system, though quirky and unwieldy, can be used to accomplish many of the required filtering steps.
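To make the flag system less opaque, the following minimal sketch decodes a SAM flag value into the bit names defined by the SAM specification. The function name `decode_flag` and the short labels in `SAM_FLAGS` are illustrative choices, not part of the recipe or of any tool.

```python
# Bit values follow the SAM specification; the short labels are illustrative.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM flag value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


# 99 = 1 + 2 + 32 + 64
print(decode_flag(99))  # ['paired', 'proper_pair', 'mate_reverse', 'read1']
```

Because each property occupies its own bit, a single integer can describe several properties at once, which is what makes flag-based filtering possible.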
This recipe provides the code necessary to investigate alignment files obtained by aligning viral sequence data from the 2014 Ebola outbreak against the Mayinga strain observed in 1976.
The recipe produces a statistics report that summarizes various types of alignments. It performs the following steps:
- Downloads the 1976 Ebola (Mayinga) reference genome
- Downloads sequencing data for the 2014 outbreak
- Generates sequence alignments of the 2014 outbreak relative to the 1976 outbreak
- Investigates the resulting data and reports the number of alignments by various criteria
- Demonstrates how more complex queries could be formulated
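The counting and querying steps above can be sketched in plain Python. This is not the recipe's own code: the function `count_alignments`, its parameters, and the sample records (including the reference name `ref`) are hypothetical. The `require` and `exclude` masks are meant to mirror the idea behind the `-f` and `-F` options of `samtools view`, and `min_mapq` the idea behind `-q`.

```python
def count_alignments(sam_lines, require=0, exclude=0, min_mapq=0):
    """Count SAM records whose FLAG has every `require` bit set,
    none of the `exclude` bits set, and MAPQ of at least `min_mapq`."""
    count = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if (flag & require) == require and not (flag & exclude) and mapq >= min_mapq:
            count += 1
    return count


# Hypothetical SAM records, made up for illustration only.
records = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tref\t100\t60\t10M\t=\t300\t210\tACGTACGTAC\tIIIIIIIIII",
    "r1\t147\tref\t300\t60\t10M\t=\t100\t-210\tACGTACGTAC\tIIIIIIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r3\t16\tref\t500\t3\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r4\t256\tref\t900\t0\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]

print(count_alignments(records, exclude=0x4))                # mapped: 4
print(count_alignments(records, require=0x4))                # unmapped: 1
print(count_alignments(records, require=0x10, exclude=0x4))  # mapped, reverse strand: 2

# A more complex query: primary alignments of properly paired first reads
# with mapping quality of at least 10.
print(count_alignments(records, require=0x2 | 0x40,
                       exclude=0x4 | 0x100 | 0x800, min_mapq=10))  # 1
```

Combining bits with `|` is what lets a single query express several conditions at once; for real BAM files, a dedicated tool or library would be the practical choice rather than parsing text by hand.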
A detailed presentation that explains the steps and rationale for this recipe can be found at:
Please refer to the lecture above for study materials and additional content.