Databases of gene variants are very useful for sharing genetic data and for facilitating the understanding of the genetic basis of diseases. This report summarises the issues surrounding the development of the Malaysian Human Variome Project Country Node. The focus is on human germline variants. Somatic variants, mitochondrial variants and other types of genetic variation have corresponding databases, which are not covered here because they raise specific issues that do not necessarily apply to germline variants.
Rapid advances in high-throughput genomics, microarray, and deep sequencing technologies have made more complex precision medicine research possible, using large amounts of heterogeneous health-related data from patients, including genomic variants. Genomic variants can be identified and annotated based on the reference human genome either within the sequence as a whole or in a putative functional genomic element. The American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) jointly created standards and guidelines for assessing evidence, in order to increase consistency and transparency in clinical variant interpretation. Various efforts toward precision medicine have been facilitated by many national and international public databases that classify and annotate genomic variation. In the present study, we highlight several resources for the identification and dissemination of clinically important genetic variants.
Haemoglobinopathies are the commonest monogenic diseases worldwide and are caused by variants in the globin gene clusters. With over 2400 variants detected to date, their interpretation using the American College of Medical Genetics and Genomics (ACMG)/Association for Molecular Pathology (AMP) guidelines is challenging, and computational evidence can provide valuable input about their functional annotation. While many in silico predictors have already been developed, their performance varies for different genes and diseases. In this study, we evaluate 31 in silico predictors using a dataset of 1627 variants in HBA1, HBA2, and HBB. By varying the decision threshold for each tool, we analyse their performance (a) as binary classifiers of pathogenicity and (b) by using different non-overlapping pathogenic and benign thresholds for their optimal use in the ACMG/AMP framework. Our results show that CADD, Eigen-PC, and REVEL are the overall top performers, with the first of these reaching a moderate strength level for pathogenic prediction. Eigen-PC and REVEL achieve the highest accuracies for missense variants, while CADD is also a reliable predictor of non-missense variants. Moreover, SpliceAI is the top-performing splicing predictor, reaching a strong level of evidence, while GERP++ and phyloP are the most accurate conservation tools. This study provides evidence about the optimal use of computational tools in globin gene clusters under the ACMG/AMP framework.
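The two modes of evaluation described above, a single decision threshold for binary classification and a pair of non-overlapping benign and pathogenic thresholds, can be illustrated with a minimal Python sketch. It is not the study's pipeline. The function names are hypothetical, the scores and labels are invented toy data, and Youden's J (sensitivity + specificity - 1) is an assumed selection criterion chosen only for simplicity.

```python
def confusion(scores, labels, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are called pathogenic."""
    tp = fp = tn = fn = 0
    for score, is_pathogenic in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_pathogenic:
            tp += 1
        elif predicted and not is_pathogenic:
            fp += 1
        elif not predicted and is_pathogenic:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sweep_thresholds(scores, labels):
    """Evaluate every observed score as a candidate decision threshold.

    Returns a list of (threshold, sensitivity, specificity, youden_j).
    """
    results = []
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, t)
        sens = tp / (tp + fn) if (tp + fn) else 0.0
        spec = tn / (tn + fp) if (tn + fp) else 0.0
        results.append((t, sens, spec, sens + spec - 1.0))
    return results


def best_threshold(scores, labels):
    """Pick the threshold with the highest Youden's J (assumed criterion)."""
    return max(sweep_thresholds(scores, labels), key=lambda r: r[3])[0]


def classify(score, benign_max, pathogenic_min):
    """Three-way call using two non-overlapping thresholds.

    Scores between the two thresholds are left indeterminate, so the tool
    contributes no computational evidence for those variants.
    """
    if benign_max >= pathogenic_min:
        raise ValueError("benign and pathogenic thresholds must not overlap")
    if score >= pathogenic_min:
        return "pathogenic"
    if score <= benign_max:
        return "benign"
    return "indeterminate"


if __name__ == "__main__":
    # Invented toy data: predictor scores and whether each variant is pathogenic.
    toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    toy_labels = [False, False, False, False, True, False, True, True]
    print("binary threshold:", best_threshold(toy_scores, toy_labels))
    for s in (0.2, 0.5, 0.95):
        print(s, classify(s, benign_max=0.3, pathogenic_min=0.8))
```

In the two-threshold mode, the gap between `benign_max` and `pathogenic_min` trades coverage for confidence: widening it leaves more variants without a call but makes the remaining calls more reliable.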
Accurate and consistent interpretation of sequence variants is integral to the delivery of safe and reliable diagnostic genetic services. To standardize the interpretation process, in 2015, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) published a joint guideline based on a set of shared standards for the classification of variants in Mendelian diseases. The generality of these standards and their subjective interpretation between laboratories have prompted efforts to reduce discordance of variant classifications, with a focus on the expert specification of the ACMG/AMP guidelines for individual genes or diseases. Herein, we describe our experience as a ClinGen Variant Curation Expert Panel in adapting the ACMG/AMP criteria for the classification of variants in three globin genes (HBB, HBA2, and HBA1) related to recessively inherited hemoglobinopathies, including five evidence categories, as use cases demonstrating the process of specification and the underlying rationale.
Thalassemia is one of the most prevalent monogenic disorders in low- and middle-income countries (LMICs). There are an estimated 270 million carriers of hemoglobinopathies (abnormal hemoglobins and/or thalassemia) worldwide, necessitating global methods and solutions for effective and optimal therapy. LMICs are disproportionately impacted by thalassemia, and due to disparities in genomics awareness and diagnostic resources, certain LMICs lag behind high-income countries (HICs). This spurred the establishment of the Global Globin Network (GGN) in 2015 at UNESCO, Paris, as a project-wide endeavor within the Human Variome Project (HVP). Primarily aimed at enhancing thalassemia clinical services, research, and genomic diagnostic capabilities with a focus on LMIC needs, GGN aims to foster data collection in a shared database by all affected nations, thus improving data sharing and thalassemia management. In this paper, we propose a minimum requirement for establishing a genomic database in thalassemia based on the HVP database guidelines. We suggest using an existing platform recommended by HVP, the Leiden Open Variation Database (LOVD) (https://www.lovd.nl/). Adoption of our proposed criteria will assist in improving or supplementing the existing databases, allowing for better-quality services for individuals with thalassemia. Database URL: https://www.lovd.nl/.
The Malaysian Node of the Human Variome Project (MyHVP) is one of the eighteen official Human Variome Project (HVP) country-specific nodes. Since its inception on 9 October 2010, MyHVP has attracted a significant number of Malaysian clinicians and researchers to participate and contribute their data to this project. MyHVP also acts as the center of coordination for genotypic and phenotypic variation studies of the Malaysian population. A specialized database was developed to store and manage data on genetic variations that are also associated with health and disease in Malaysian ethnic groups. This ethnic-specific database is called the Malaysian Node of the Human Variome Project database (MyHVPDb).