Bacterial variant caller

This recipe contains the code presented by Torsten Seemann in the blog post "A Unix one-liner to call bacterial variants".

The one-liner from that post has been generalized slightly and extended with a few utility steps, such as downloading the data and reporting read statistics.

This recipe provides code that:

  1. Downloads a reference genome (default value: NZ_CP008918 Pasteurella multocida strain ATCC 43137, complete genome)
  2. Downloads a sequencing run from SRA (default value: SRR4124989)
  3. Generates statistics on the sequencing data
  4. Runs minimap2 to align the reads to the reference genome and produce SAM output
  5. Runs bcftools mpileup to compute genotype likelihoods at each position
  6. Runs bcftools call, using the multiallelic calling model, to call variants
  7. Runs bcftools norm to normalize each variant to a standard form
  8. Runs bcftools filter to remove variants with low quality (QUAL) or low coverage (DP)
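
Steps 4 to 8 above can be sketched as a single shell pipeline. This is a minimal illustration, not the recipe's actual code. The function name `call_variants`, the `MIN_QUAL` and `MIN_DP` variables and their values, and the paired-end input layout are all assumptions made for the example. A `samtools sort` step is also added, because `bcftools mpileup` expects coordinate-sorted alignments; that step is not in the list above.

```bash
#!/usr/bin/env bash
# Sketch of steps 4-8 as one pipeline.
# Assumes minimap2, samtools and bcftools are on the PATH.

# Hypothetical filter thresholds; tune them for your own data.
MIN_QUAL=${MIN_QUAL:-40}   # minimum variant quality (QUAL)
MIN_DP=${MIN_DP:-10}       # minimum read depth (DP)

# call_variants REF R1 R2
# Writes filtered variants in VCF format to standard output.
# Paired-end reads are an assumption of this sketch.
call_variants() {
    local ref=$1 r1=$2 r2=$3

    # Step 4: align short reads to the reference, SAM output (-a).
    minimap2 -a -x sr "$ref" "$r1" "$r2" |
        # Added step: coordinate-sort the alignments for mpileup.
        samtools sort -O bam - |
        # Step 5: genotype likelihoods at each position.
        bcftools mpileup -Ou -f "$ref" - |
        # Step 6: multiallelic caller (-m), variant sites only (-v).
        bcftools call -m -v -Ou - |
        # Step 7: normalize variants against the reference.
        bcftools norm -f "$ref" -Ou - |
        # Step 8: exclude low-quality or low-depth variants.
        bcftools filter -Ov -e "QUAL<${MIN_QUAL} || DP<${MIN_DP}" -
}
```

With a reference in `ref.fa` and reads in `reads_1.fq` and `reads_2.fq` (hypothetical file names), the function could be used as `call_variants ref.fa reads_1.fq reads_2.fq > variants.vcf`. Passing uncompressed BCF (`-Ou`) between the bcftools stages avoids repeated compression and text parsing; only the last stage emits plain VCF (`-Ov`).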
