FASTQ data quality control

Perform data quality control on FASTQ data

3 results • updated 5.6 years ago by Istvan Albert

This recipe serves as the introductory recipe for the course and this entire site.

The recipe code will demonstrate the following:

1. Downloading sequencing data from SRA
2. Generating a FASTQC report on this data
3. Trimming Illumina adapters from the dataset in paired-end mode
4. Generating a FASTQC report on the trimmed data

Lectures

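A detailed presentation that explains the steps and rationale of the site and this recipe can be read at:

Lecture 1: What is a Recipe? https://www.biostarhandbook.com/edu/lecture/view/40/
Lecture on Recipe 2: FASTQ data quality control https://www.biostarhandbook.com/edu/lecture/view/41/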

Please refer to the lectures above for links to the chapters that cover each concept.

# This recipe downloads sequencing data from SRA
# then performs quality filtering and adapter trimming.

# This is how the recipe gets the SRA
# variable filled via the website.
SRA=SRR519926

# Stop the script on errors.
set -ue

# How many sequences to unpack.
N=10000

# Create directory to store the reads in.
mkdir -p reads

# Download the first $N reads from SRA.
fastq-dump --split-files -X "$N" -O reads "$SRA"

# Make a directory for the fastqc reports
mkdir -p reports

# Run the fastqc report on the sra reads.
fastqc reads/*.fastq -o reports

# Create the adapter sequence used for trimming.
echo ">illumina" > adapter.fa
echo "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC" >> adapter.fa

# Run trimmomatic on the data.
trimmomatic PE reads/${SRA}_1.fastq reads/${SRA}_2.fastq \
    -baseout reads/${SRA}.trimmed.fq \
    ILLUMINACLIP:adapter.fa:2:30:5 SLIDINGWINDOW:4:20

# Run the fastqc report on the trimmed data.
fastqc reads/*.fq -o reports

# Delete the fastqc zip files to reduce clutter.
rm -f reports/*.zip
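
One way to sanity-check the script above is to count the reads in each file before and after trimming. The sketch below is not part of the original recipe: `count_reads` is a hypothetical helper that assumes uncompressed FASTQ with exactly four lines per record, and the demo files are made up so that the snippet runs without any sequencing tools installed.

```shell
# Hypothetical helper, not part of the original recipe.
# Assumes uncompressed FASTQ with exactly four lines per record.
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Build two tiny demo files so the sketch runs on its own.
DEMO=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "$DEMO/demo_1.fastq"
printf '@r1\nTGCA\n+\nIIII\n@r2\nCCAT\n+\nIIII\n' > "$DEMO/demo_2.fastq"

# Paired-end files should contain the same number of reads.
N1=$(count_reads "$DEMO/demo_1.fastq")
N2=$(count_reads "$DEMO/demo_2.fastq")

if [ "$N1" -eq "$N2" ]; then
    echo "pairs match: $N1 reads each"
else
    echo "pair mismatch: $N1 vs $N2" >&2
fi
```

In the recipe itself the same function could be pointed at `reads/${SRA}_1.fastq` and `reads/${SRA}_2.fastq`, and then at the surviving pairs. With `-baseout reads/${SRA}.trimmed.fq`, Trimmomatic is documented to write those pairs to `reads/${SRA}.trimmed_1P.fq` and `reads/${SRA}.trimmed_2P.fq`.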

Edit the content of each interface element.

[sra]
label = "SRA Run Number"
display = "TEXTBOX"
value = "SRR519926"
help = "An SRR run number"

[settings]
name = "FASTQ data quality control"
template = "FASTQ_data_quality_control_cookbook_36.sh"
image = "FASTQ_data_quality_control_cookbook_36.png"
id = 36
recipe_uid = "fastq01"
uid = "fastq01"
help = "Perform data quality control on FASTQ data\n\nThis recipe serves as the introductory recipe for the course and this entire site. \n\nThe recipe code will demonstrate the following:\n\n1. Downloading sequencing data from SRA\n2. Generating a FASTQC report on this data\n3. Trimming Illumina adapters from the dataset in paired-end mode\n4. Generating a FASTQC report on the trimmed data\n\n#### Lectures\n\nA detailed presentation that explains the steps and rationale of the site and this recipe can be read at:\n\n* [Lecture 1: What is a Recipe?][lecture1]\n* [Lecture on Recipe 2: FASTQ data quality control][lecture2]\n\nPlease refer to the lectures above for links to the chapters that cover each concept.\n\n[lecture1]: https://www.biostarhandbook.com/edu/lecture/view/40/\n[lecture2]: https://www.biostarhandbook.com/edu/lecture/view/41/"
url = "http://localhost:8000"
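
The `sra` field above is a free-text box, so a script that consumes its value may want to check it before calling `fastq-dump`. The following is a minimal sketch, not something the recipe does: `is_run_accession` is a hypothetical helper, and it assumes that a run accession is `SRR`, `ERR`, or `DRR` followed only by digits.

```shell
# Hypothetical helper, not part of the original recipe.
# Assumption: a run accession is SRR, ERR, or DRR followed by digits only.
is_run_accession() {
    case "$1" in
        [SED]RR*[!0-9]*) return 1 ;;   # a non-digit appears after the prefix
        [SED]RR[0-9]*)   return 0 ;;   # prefix followed by digits only
        *)               return 1 ;;   # anything else, including an empty value
    esac
}

# Try the check on the recipe's default value and on two malformed values.
for ID in SRR519926 srr519926 SRR51x9; do
    if is_run_accession "$ID"; then
        echo "$ID: looks like a run accession"
    else
        echo "$ID: rejected"
    fi
done
```

The check uses only a POSIX `case` pattern, so it should work in any `sh`-compatible shell. It validates the format only and does not confirm that the run exists in SRA.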
