This recipe evaluates the accuracy of the Centrifuge classification for selected markers.
The recipe generates simulated sequencing reads from NCBI accession numbers, then classifies the reads with Centrifuge.
In the third step it evaluates and reports the accuracy of the classification (the percentage of reads that were classified correctly).
The input to the recipe is a list of NCBI accession numbers.
The main output is a file that lists the classification accuracy for each accession number:
accession  expect  actual  percent  taxid  title
AP012081   1000    993     99.3     67547  Margariscus margarita mitochondrial DNA, almost complete genome, except for D-loop
AP011279   1000    996     99.6     90988  Pimephales promelas mitochondrial DNA, complete genome
DQ288268   1000    996     99.6     8022   Oncorhynchus mykiss mitochondrion, complete genome
AP014537   1000    1000    100.0    27706  Micropterus salmoides mitochondrial DNA, almost complete genome
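
The percent column can be checked by hand. The sketch below is not the recipe's own code; it assumes that "expect" is the number of simulated reads per accession and "actual" is the number of those reads classified correctly, which is how the header reads. The helper names accuracy_percent and parse_row are hypothetical.

```python
def accuracy_percent(expected: int, correct: int) -> float:
    """Percent of reads classified correctly, rounded to one decimal."""
    if expected <= 0:
        raise ValueError("expected read count must be positive")
    return round(100.0 * correct / expected, 1)


def parse_row(line: str) -> dict:
    """Split one whitespace-separated report row; the title keeps its spaces."""
    accession, expect, actual, percent, taxid, title = line.split(None, 5)
    return {
        "accession": accession,
        "expect": int(expect),
        "actual": int(actual),
        "percent": float(percent),
        "taxid": int(taxid),
        "title": title.strip(),
    }


# Two rows copied from the report shown above.
rows = [
    "AP012081 1000 993 99.3 67547 Margariscus margarita mitochondrial DNA, "
    "almost complete genome, except for D-loop",
    "AP014537 1000 1000 100.0 27706 Micropterus salmoides mitochondrial DNA, "
    "almost complete genome",
]

for line in rows:
    row = parse_row(line)
    recomputed = accuracy_percent(row["expect"], row["actual"])
    # The recomputed value should match the percent column of the report.
    print(row["accession"], recomputed, recomputed == row["percent"])
```

Splitting with a maximum of five splits keeps the free-text title in one field, so the same parser works whether the columns are separated by tabs or by runs of spaces.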