METHODS AND RESULTS: In view of the lack of studies on their mitogenomes, we sequenced (by next-generation sequencing) and annotated the complete mitogenome of D. vijaysegarani from Malaysia to determine its features and phylogenetic relationships. The whole mitogenome of D. vijaysegarani has the same gene order as the published mitogenomes of the genus Dacus, with 13 protein-coding genes (PCGs), two rRNA genes, 22 tRNAs, a non-coding A + T rich control region, and intergenic spacer and overlap sequences. Phylogenetic analysis based on 15 mitochondrial genes (13 PCGs and two rRNA genes) shows that Dacus, Zeugodacus and Bactrocera form a distinct clade. The genus Dacus forms a monophyletic group within a subclade that also contains the Zeugodacus group; this Dacus-Zeugodacus subclade is distinct from the Bactrocera subclade. D. (Mellesis) vijaysegarani forms a lineage with D. (Mellesis) trimacula in a subcluster that also contains the lineage of D. (Mellesis) conopsoides and D. (Callantra) longicornis. D. (Dacus) bivittatus and D. (Didacus) ciliatus form a distinct subcluster. Based on cox1 sequences, the taxa of D. vijaysegarani from Malaysia and Vietnam may not be conspecific.
CONCLUSIONS: Overall, the mitochondrial genome of D. vijaysegarani provides essential molecular data that could be useful for future studies of species diagnosis, evolution and phylogeny in other tephritid fruit flies.
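The cox1 comparison between the Malaysian and Vietnamese taxa can be illustrated with a simple pairwise divergence calculation. The abstract does not state which distance measure was used, so the sketch below assumes the uncorrected p-distance (proportion of differing sites) over two pre-aligned sequences. The function name `p_distance` and the toy sequences are hypothetical and are not data from the study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned nucleotide sequences.

    Assumption: the sequences are already aligned to the same length.
    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue  # skip gaps ('-') and ambiguity codes such as 'N'
        compared += 1
        if a != b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Toy, made-up fragments (NOT real cox1 data): one difference in eight sites.
toy_a = "ACGTACGT"
toy_b = "ACGTACGA"
print(p_distance(toy_a, toy_b))
```

In practice, the divergence between two populations would be interpreted against the within-species and between-species distances typical for the group, and any threshold for conspecificity is a judgment that the code does not make.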
RESULTS: As part of the Vertebrate Genomes Project (VGP) we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100-300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene region duplications. The presence of repeats and duplications in over half of the species herein assembled indicates that their occurrence is a principle of mitochondrial structure rather than an exception, shedding new light on mitochondrial genome evolution and organization.
CONCLUSIONS: Our results indicate that even in the "simple" case of vertebrate mitogenomes the completeness of many currently available reference sequences can be further improved, and caution should be exercised before claiming the complete assembly of a mitogenome, particularly from short reads alone.
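The idea of similarity-based identification of mitochondrial reads can be sketched in a few lines. The abstract does not describe how mitoVGP measures similarity, so the example below substitutes a simple shared k-mer fraction against a reference sequence; this is a stand-in for illustration, not the pipeline's actual method. The names `select_mito_reads`, `k`, and `min_fraction`, and all sequences, are hypothetical.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def select_mito_reads(reads: dict, reference: str, k: int = 5,
                      min_fraction: float = 0.5) -> list:
    """Keep reads whose k-mers are mostly shared with the reference.

    Assumption: k-mer sharing is used here as a simple proxy for
    sequence similarity; a real pipeline would use a read aligner.
    Both strands of the reference are indexed.
    """
    ref_kmers = kmers(reference, k) | kmers(revcomp(reference), k)
    selected = []
    for name, seq in reads.items():
        read_kmers = kmers(seq, k)
        if not read_kmers:
            continue  # read shorter than k: nothing to compare
        shared = len(read_kmers & ref_kmers) / len(read_kmers)
        if shared >= min_fraction:
            selected.append(name)
    return selected


# Toy, made-up data: two reads derived from the reference, one unrelated.
reference = "ACGTTGCATGCCGATAGCTAGGCTA"
reads = {
    "mito_fwd": reference[3:18],
    "mito_rev": revcomp(reference[5:20]),
    "nuclear": "TTTTTTTTTTTTTTT",
}
print(select_mito_reads(reads, reference))
```

The selected reads would then be passed to a de novo assembler. Because the filter depends on a reference from a related species, reads from repeats or duplications absent in that reference still need to be recovered by the assembly step, which is where long reads help.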