This recipe evaluates the accuracy of the Centrifuge classification for selected markers.
The recipe generates simulated sequencing reads from NCBI accession numbers, then classifies them with Centrifuge.
In the third step it evaluates and reports the accuracy of the classification (the percent of reads that were classified correctly).
The input to the recipe is a list of NCBI accession numbers:
AP012081 AP011279 DQ288268 AP014537
The main output is a file that lists the classification accuracy for each accession number:
accession  expect  actual  percent  taxid  title
AP012081   1000    993     99.3     67547  Margariscus margarita mitochondrial DNA, almost complete genome, except for D-loop
AP011279   1000    996     99.6     90988  Pimephales promelas mitochondrial DNA, complete genome
DQ288268   1000    996     99.6     8022   Oncorhynchus mykiss mitochondrion, complete genome
AP014537   1000    1000    100.0    27706  Micropterus salmoides mitochondrial DNA, almost complete genome
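
The percent column appears to be the actual count divided by the expected count, times 100. The sketch below reproduces that arithmetic from the counts listed in the output above. It is only an illustration, not the recipe's own code: the function name `accuracy_percent` and the rounding to one decimal place are assumptions, as is reading `expect` as the number of simulated reads and `actual` as the number classified correctly.

```python
def accuracy_percent(expect: int, actual: int) -> float:
    """Return the percent of reads classified correctly, to one decimal place.

    Assumption: expect is the number of simulated reads for an accession and
    actual is the number of those reads that Centrifuge assigned correctly.
    """
    if expect <= 0:
        raise ValueError("expect must be a positive read count")
    return round(100.0 * actual / expect, 1)


# (accession, expect, actual) values copied from the output listed above.
rows = [
    ("AP012081", 1000, 993),
    ("AP011279", 1000, 996),
    ("DQ288268", 1000, 996),
    ("AP014537", 1000, 1000),
]

# Map each accession to its classification accuracy.
report = {acc: accuracy_percent(expect, actual) for acc, expect, actual in rows}

for acc, pct in report.items():
    print(f"{acc}\t{pct}")
```

For the four accessions above this yields the same values as the percent column (99.3, 99.6, 99.6 and 100.0).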