Affiliations 

  • 1 Centre for Bioinformatics, School of Data Sciences, Perdana University, Jalan MAEPS Perdana, Serdang, Selangor Darul Ehsan, 43400, Malaysia. asif@perdanauniversity.edu.my
  • 2 Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, 8 Medical Drive, Singapore, 117597, Singapore
  • 3 Centre for Genomics and Global Health, University of Oxford, Oxford, UK
  • 4 Centre for Bioinformatics, School of Data Sciences, Perdana University, Jalan MAEPS Perdana, Serdang, Selangor Darul Ehsan, 43400, Malaysia
  • 5 Menzies Health Institute Queensland, Griffith University, Parklands Dr, Southport, 4215, QLD, Australia
  • 6 Department of Pharmacology and Molecular Sciences, The Johns Hopkins University School of Medicine, 725 North Wolfe Street, Baltimore, MD, 21205, USA
BMC Med Genomics. 2017 Dec 21;10(Suppl 4):78.
PMID: 29322922 DOI: 10.1186/s12920-017-0301-2

Abstract

BACKGROUND: Viral vaccine target discovery requires understanding the diversity of both the virus and the human immune system. The readily available and rapidly growing pool of viral sequence data in the public domain enables the identification and characterization of immune targets relevant to adaptive immunity. A systematic bioinformatics approach is necessary to facilitate the analysis of such large datasets for the selection of potential candidate vaccine targets.

RESULTS: This work describes a computational methodology for this analysis, using data from dengue, West Nile, hepatitis A, HIV-1, and influenza A viruses as examples. The methodology has been implemented as an analytical pipeline that advances the field of reverse vaccinology by enabling systematic screening of known natural sequence data for the identification of vaccine targets. The pipeline comprises six key steps: (i) comprehensive collection of sequence data of viral proteomes (the virome), (ii) data cleaning, (iii) large-scale sequence alignment, (iv) peptide entropy analysis, (v) intra- and inter-species variation analysis of conserved sequences, including human homology analysis, and (vi) functional and immunological relevance analysis.
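
As an illustration of the peptide entropy step (iv), the following is a minimal sketch and not the authors' implementation. It assumes that entropy means Shannon entropy (in bits) computed over the distinct peptides observed in a sliding window across a protein alignment; the abstract does not state the formula or the window length. The function name `peptide_entropy`, the default window of 9 residues, and the rule of skipping windows that contain gaps (`-`) or ambiguous residues (`X`) are all hypothetical choices made for this example.

```python
from collections import Counter
from math import log2


def peptide_entropy(aligned_seqs, k=9):
    """Shannon entropy (bits) of the k-mer peptides starting at each alignment column.

    aligned_seqs: equal-length protein sequences taken from one alignment.
    k: window length; 9 is an assumed default, not a value from the abstract.
    Returns one value per window start, or None where no usable peptide exists.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")

    entropies = []
    for start in range(length - k + 1):
        # Take the k-mer at this position from every sequence, skipping
        # windows with gaps or ambiguous residues (an assumed cleaning rule).
        windows = [s[start:start + k] for s in aligned_seqs]
        windows = [w for w in windows if "-" not in w and "X" not in w]
        counts = Counter(windows)
        total = sum(counts.values())
        if total == 0:
            entropies.append(None)
            continue
        # H = sum over distinct peptides of p * log2(1 / p), with p = count / total.
        entropies.append(sum((c / total) * log2(total / c) for c in counts.values()))
    return entropies


# Toy alignment: the first two windows are fully conserved; the third
# splits evenly between two peptide variants.
toy = ["ACDEF", "ACDEF", "ACDEY", "ACDEY"]
print(peptide_entropy(toy, k=3))  # [0.0, 0.0, 1.0]
```

Under these assumptions, positions with entropy at or near zero are the conserved candidates that would pass to the later variation and homology steps. Small samples tend to underestimate entropy, so a real analysis would likely need a correction for sample size.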

CONCLUSION: Combining these steps into a single pipeline yields a more refined process than a simple evolutionary conservation analysis, facilitating better selection of vaccine targets and their prioritization for subsequent experimental validation.
