Bioinformatics Recipe Cookbook

Half sequence and half mythical beast, "unaligned" BAM files are used to store the contents of FASTQ files.

SAM/BAM are alignment formats, so it feels contradictory to use them to store "unaligned" sequences.

On the other hand, BAM files have quite a few advantages over FASTQ:

  1. They are compressed.
  2. They are record oriented: all information on a read sits in a single record (a single line when viewed as SAM).
  3. They can store sample information via tags.
  4. Many tools can operate on BAM files (extract by tags, filter by tags, etc.).

In addition, the BAM format stores not just the alignments but the entire read sequences. To take advantage of the features listed above, some bioinformaticians began storing their original, raw data in a so-called "unaligned" BAM. Thus we have "unaligned" reads in an "alignment format".
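
To make the idea of an "unaligned" record concrete, here is a minimal sketch that turns FASTQ records into unaligned SAM lines (the text form of BAM). It assumes four-line FASTQ records with Phred+33 qualities, and it assumes the run number is stored in the `RG` read-group tag; the recipe itself may use a different tag. The helper names `fastq_records`, `unaligned_sam_line`, and `sam_header` are hypothetical.

```python
import io


def fastq_records(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of input (a blank line also stops the reader)
        seq = handle.readline().strip()
        handle.readline()  # the '+' separator line carries no data we need
        qual = handle.readline().strip()
        # Keep only the read name: drop the leading '@' and any description.
        name = header.strip()[1:].split()[0]
        yield name, seq, qual


def unaligned_sam_line(name, seq, qual, run_id):
    """Build one SAM line for a single-end read with no alignment.

    FLAG 4 marks the read as unmapped; the reference name, CIGAR and mate
    reference are '*', and all positions and lengths are 0.
    The RG tag holding the run number is an assumption of this sketch.
    """
    fields = [name, "4", "*", "0", "0", "*", "*", "0", "0", seq, qual,
              f"RG:Z:{run_id}"]
    return "\t".join(fields)


def sam_header(run_ids):
    """Header lines: an unsorted file plus one read group per run."""
    return ["@HD\tVN:1.6\tSO:unsorted"] + [f"@RG\tID:{r}" for r in run_ids]


demo_fastq = io.StringIO(
    "@read1 length=4\nACGT\n+\nIIII\n"
    "@read2 length=4\nTTGA\n+\nFFFF\n"
)
lines = sam_header(["SRR519926"]) + [
    unaligned_sam_line(name, seq, qual, "SRR519926")
    for name, seq, qual in fastq_records(demo_fastq)
]
print("\n".join(lines))
```

Every read carries its sequence, its qualities, and a tag naming the run it came from, which is what makes tag-based filtering possible later on.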

This recipe demonstrates code that will:

  1. Access an NCBI BioProject with a given accession number.
  2. Download 5 sequencing runs from the project.
  3. Transform each downloaded FASTQ file into a BAM file, tagging the reads from that file with the SRR run number.
  4. Merge the resulting BAM files into a single BAM that contains all sequencing data for the project in just one file.
  5. Finally, reverse the process by extracting the original data from the unaligned BAM.
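
Steps 3 to 5 could be sketched as follows. This is not the recipe's actual code: it only builds command lines and prints them, assuming a recent samtools that provides the `import` subcommand and storing the run number as a read group. The run numbers, file names, and function names are hypothetical, and the flags should be checked against the installed samtools version before running anything.

```python
def import_cmd(fastq, run_id, out_bam):
    """FASTQ -> unaligned BAM, labelling every read with the run as read group."""
    return ["samtools", "import", "-0", fastq, "-R", run_id, "-o", out_bam]


def merge_cmd(bams, merged_bam):
    """Combine the per-run BAM files into a single BAM."""
    return ["samtools", "merge", "-o", merged_bam, *bams]


def extract_cmds(merged_bam, run_id, out_fastq):
    """Reverse the process for one run: select its read group, then write FASTQ."""
    run_bam = f"{run_id}.subset.bam"
    return [
        ["samtools", "view", "-b", "-r", run_id, "-o", run_bam, merged_bam],
        ["samtools", "fastq", "-0", out_fastq, run_bam],
    ]


def plan(run_ids, merged_bam="project.bam"):
    """Return the full list of commands for the given (hypothetical) runs."""
    cmds = [import_cmd(f"{r}.fastq", r, f"{r}.bam") for r in run_ids]
    cmds.append(merge_cmd([f"{r}.bam" for r in run_ids], merged_bam))
    for r in run_ids:
        cmds.extend(extract_cmds(merged_bam, r, f"{r}.restored.fastq"))
    return cmds


# Placeholder run numbers, not taken from any real project.
commands = plan(["SRR0000001", "SRR0000002"])
for cmd in commands:
    print(" ".join(cmd))
```

To execute the plan rather than print it, each list can be passed to `subprocess.run(cmd, check=True)` on a machine where samtools is installed.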

Interface specification
Each entry below shows the specification code for one type of interface element.
Integer values
    [size]
    label = "Window size"
    display = "INTEGER"
    value = 100
    range = [1, 100]
    help = "Selects the smoothing window."

Float values
    [cutoff]
    label = "P_Value Cutoff"
    display = "FLOAT"
    value = 0.05
    range = [0, 1]
    help = "Selects the cutoff."

Text box
    [sra]
    label = "Run Number"
    display = "TEXTBOX"
    value = "SRR519926"
    regex = 'SRR\d+'
    help = "Please provide SRR Run number"

Dropdown menu
    [color]
    label = "Select color"
    display = "DROPDOWN"
    choices = [ ["R","Red"], ["B","Blue"] ]
    value = "R"
    help = "Select a color of your choice"

Check box
    [validate]
    label = "Cross Validate"
    display = "CHECKBOX"
    value = true
    help = "Apply cross validation on the results."

Upload Field
    [file]
    label = "Upload a file"
    display = "UPLOAD"
    help = "Upload a file to analyze"

Pick Data
    [data]
    label = "Pick data"
    source = "PROJECT"
    help = "Pick data from this project."

Radio Buttons
    [species]
    label = "Select species"
    display = "RADIO"
    choices = [ ["c","Cat"], ["d","Dog"] ]
    value = "c"
    help = "Select the species."
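
The `range`, `regex`, and `choices` fields above suggest constraints on what a user may enter. The sketch below shows one way such constraints could be checked once a specification has been parsed into a dictionary. How the engine really validates input is not stated here, so the inclusive range test, the full-string regex match, and the `check_value` function itself are assumptions.

```python
import re

# A few of the elements above, written as the dictionary a TOML parser would produce.
SPEC = {
    "size": {"display": "INTEGER", "value": 100, "range": [1, 100]},
    "cutoff": {"display": "FLOAT", "value": 0.05, "range": [0, 1]},
    "sra": {"display": "TEXTBOX", "value": "SRR519926", "regex": r"SRR\d+"},
    "color": {"display": "DROPDOWN", "value": "R",
              "choices": [["R", "Red"], ["B", "Blue"]]},
}


def check_value(field, value):
    """Return a list of problems found for `value`; an empty list means it passes."""
    problems = []
    if "range" in field:
        low, high = field["range"]
        # Assumption: both ends of the range are allowed values.
        if not isinstance(value, (int, float)) or not (low <= value <= high):
            problems.append(f"{value!r} is outside [{low}, {high}]")
    if "regex" in field:
        # Assumption: the pattern must match the whole string.
        if not re.fullmatch(field["regex"], str(value)):
            problems.append(f"{value!r} does not match {field['regex']}")
    if "choices" in field:
        allowed = [key for key, _label in field["choices"]]
        if value not in allowed:
            problems.append(f"{value!r} is not one of {allowed}")
    return problems


# The default values given in the specification pass their own constraints.
for name, field in SPEC.items():
    print(name, check_value(field, field["value"]))
```

A value that breaks a constraint, such as `check_value(SPEC["size"], 101)`, comes back with a non-empty list describing the problem.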
                    
