How to align simulated sequencing data

This recipe aligns simulated reads with a short read aligner.

1 result • updated 5.3 years ago by Istvan Albert

For more information see the

* Bioinformatics Data Analysis online course: https://www.biostarhandbook.com/edu/course/4/
* Biostar Handbook for system setup and other information: https://www.biostarhandbook.com/
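
The recipe's script calls four command line tools: efetch, bwa, samtools and wgsim. A minimal sketch of a pre-flight check follows, assuming a POSIX shell. The function name `require_tools` is hypothetical and not part of the recipe.

```shell
# Hypothetical helper, not part of the original recipe.
# Reports every missing tool on stderr and returns non-zero if any is absent.
require_tools() {
    missing=0
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "Missing required tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

One way to use it is to run `require_tools efetch bwa samtools wgsim || exit 1` before the first command of the script, so that a missing installation is reported up front rather than halfway through the run.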

#
# This recipe simulates sequencing reads from a reference genome,
# identified by its accession number, and aligns them back to it.
#
set -uex

# Accession number for the reference.
ACC=AF086833

# How many read pairs to generate.
N=10000

# Make the genome directory
mkdir -p refs

# This will be the reference genome.
REF=refs/$ACC.fa

# The resulting alignment file.
BAM=align.bam

# Obtain the reference genome.
efetch -db nuccore -format fasta -id $ACC > $REF

# Index reference genome for BWA.
bwa index $REF 2>> runlog.txt

# Index the reference genome for IGV.
samtools faidx $REF

# The read pair files produced by the simulation.
R1=r1.fq
R2=r2.fq

# Generate the simulated reads.
# The mutations will be stored in the mutations.txt file.
wgsim -N $N $REF $R1 $R2 > mutations.txt 2>> runlog.txt

# Align the reads and generate the BAM file.
bwa mem $REF $R1 $R2 2>> runlog.txt | samtools sort > $BAM

# Index the alignment file.
samtools index $BAM

# Generate a flagstat.
samtools flagstat $BAM > flagstat.txt
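
The flagstat report can be used as a quick check that the simulated reads aligned back to the reference. The sketch below is not part of the recipe: the function names `mapped_reads` and `total_reads` are hypothetical, and it assumes the default text layout of `samtools flagstat`, where each line starts with `<passed> + <failed>` followed by a description.

```shell
# Hypothetical helpers, not part of the original recipe.

# Prints the QC-passed count from the "mapped" line of a flagstat report.
# The $4 test skips the separate "primary mapped" line of newer samtools.
mapped_reads() {
    awk '$2 == "+" && $4 == "mapped" { print $1; exit }' "$1"
}

# Prints the QC-passed count from the "in total" line of a flagstat report.
total_reads() {
    awk '$2 == "+" && $4 == "in" && $5 == "total" { print $1; exit }' "$1"
}
```

For example, `echo "$(mapped_reads flagstat.txt) of $(total_reads flagstat.txt) records mapped"` prints a one-line summary. The counts are alignment records, so they include any secondary or supplementary alignments the aligner reports.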

Edit the content of each interface element.

[acc]
label = "Genome accession number"
display = "TEXTBOX"
value = "AF086833"
regex = "^\\w{1,20}$"
help = "Must be an NCBI accession number"

[N]
label = "The number of read pairs"
display = "INTEGER"
value = 10000
range = [ 1, 100000,]
help = "How many simulated read pairs to generate."

[settings]
id = 72
recipe_uid = "recipe-simulate"
uid = "recipe-simulate"
name = "How to align simulated sequencing data"
template = "How_to_align_simulated_sequencing_data_bio-data-analysis_72.sh"
image = "How_to_align_simulated_sequencing_data_bio-data-analysis_72.png"
project_uid = "2a988f7f"
url = "http://localhost:8000"
help = "This recipe aligns simulated reads with a short read aligner.\n\nFor more information see the\n\n* [Bioinformatics Data Analysis][link]  online course.\n* [Biostar Handbook][book] for system setup and other information.\n\n[link]: https://www.biostarhandbook.com/edu/course/4/\n[book]: https://www.biostarhandbook.com/"
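
The interface specification constrains the two parameters: the accession must match `^\w{1,20}$` and the number of read pairs must lie between 1 and 100000. When the script is run outside the recipe interface, the same limits could be enforced in the shell. This is a minimal sketch under stated assumptions: the function names `valid_acc` and `valid_n` are hypothetical, and `\w` is approximated as ASCII letters, digits and underscore.

```shell
# Hypothetical helpers, not part of the original recipe.

# Succeeds if the accession is 1 to 20 characters long and contains
# only letters, digits or underscores (an ASCII approximation of \w).
valid_acc() {
    case "$1" in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
    esac
    [ "${#1}" -le 20 ]
}

# Succeeds if the value is a whole number between 1 and 100000 inclusive.
valid_n() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 100000 ]
}
```

In the script these could guard the two settings, for example `valid_acc "$ACC" || exit 1` and `valid_n "$N" || exit 1` placed right after the variables are assigned.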
