Displaying publications 81 - 100 of 468 in total

  1. Angers-Loustau A, Petrillo M, Bengtsson-Palme J, Berendonk T, Blais B, Chan KG, et al.
    F1000Res, 2018;7.
    PMID: 30026930 DOI: 10.12688/f1000research.14509.2
    Next-Generation Sequencing (NGS) technologies are expected to play a crucial role in the surveillance of infectious diseases, with their unprecedented capabilities for the characterisation of genetic information underlying the virulence and antimicrobial resistance (AMR) properties of microorganisms.  In the implementation of any novel technology for regulatory purposes, important considerations such as harmonisation, validation and quality assurance need to be addressed.  NGS technologies pose unique challenges in these regards, in part due to their reliance on bioinformatics for the processing and proper interpretation of the data produced.  Well-designed benchmark resources are thus needed to evaluate, validate and ensure continued quality control over the bioinformatics component of the process.  This concept was explored as part of a workshop on "Next-generation sequencing technologies and antimicrobial resistance" held October 4-5 2017.   Challenges involved in the development of such a benchmark resource, with a specific focus on identifying the molecular determinants of AMR, were identified. For each of the challenges, sets of unsolved questions that will need to be tackled for them to be properly addressed were compiled. These take into consideration the requirement for monitoring of AMR bacteria in humans, animals, food and the environment, which is aligned with the principles of a "One Health" approach.
    Matched MeSH terms: Computational Biology/methods*
  2. Fotoohifiroozabadi S, Mohamad MS, Deris S
    J Bioinform Comput Biol, 2017 Apr;15(2):1750004.
    PMID: 28274174 DOI: 10.1142/S0219720017500044
    Protein structure alignment and comparison methods that are based on an alphabetical representation of protein structure are simpler to run and faster to evaluate; however, their accuracy is not as reliable as that of three-dimensional (3D) tools. As a 1D method candidate, TS-AMIR used the alphabetic representation of secondary-structure elements (SSE) of proteins and compared the assigned letters to each SSE using the [Formula: see text]-gram method. Although the results were comparable to those obtained via geometrical methods, the SSE length and accuracy of adjacency between SSEs were not considered in the comparison process. Therefore, to obtain further information on accuracy of adjacency between SSE vectors, the new approach of assigning text to vectors was adopted according to the spherical coordinate system in the present study. Moreover, dynamic programming was applied in order to account for the length of SSE vectors. Five common datasets were selected for method evaluation. The first three datasets were small, but difficult to align, and the remaining two datasets were used to compare the capability of the proposed method with that of other methods on a large protein dataset. The results showed that the proposed method, as a text-based alignment approach, obtained results comparable to both 1D and 3D methods. It outperformed 1D methods in terms of accuracy and 3D methods in terms of runtime.
    Matched MeSH terms: Computational Biology/methods*
  3. Wahab HA, Amaro RE, Cournia Z
    J Chem Inf Model, 2018 11 26;58(11):2175-2177.
    PMID: 30277769 DOI: 10.1021/acs.jcim.8b00642
    Matched MeSH terms: Computational Biology*
  4. Baharum SN, Azizan KA
    Adv Exp Med Biol, 2018 11 2;1102:51-68.
    PMID: 30382568 DOI: 10.1007/978-3-319-98758-3_4
    Over the last decade, metabolomics has continued to grow rapidly and is considered a dynamic technology for envisaging and elucidating complex phenotypes in the systems biology area. The advantage of metabolomics compared to other omics technologies such as transcriptomics and proteomics is that the latter consider only the intermediate steps in the central dogma pathway (mRNA and protein expression), whereas metabolomics reveals the downstream products of gene and protein expression. The most frequently used tools are nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS). Some of the common MS-based analyses are gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These high-throughput instruments play an extremely crucial role in discovery metabolomics to generate data needed for further analysis. This chapter discusses the concept of metabolomics in the context of systems biology and provides examples of its application in human disease studies, plant responses towards stress and abiotic resistance, and microbial metabolomics for biotechnology applications. Lastly, a few case studies of metabolomics analysis are presented, for example, investigation of the metabolome of an aromatic herbal plant, Persicaria minor, and microbial metabolomics for metabolic engineering applications.
    Matched MeSH terms: Systems Biology*
  5. Ramzi AB, Che Me ML, Ruslan US, Baharum SN, Nor Muhammad NA
    PeerJ, 2019;7:e8065.
    PMID: 31879570 DOI: 10.7717/peerj.8065
    Background: G. boninense is a hemibiotrophic fungus that infects oil palms (Elaeis guineensis Jacq.) causing basal stem rot (BSR) disease and consequent massive economic losses to the oil palm industry. The pathogenicity of this white-rot fungus has been associated with cell wall degrading enzymes (CWDEs) released during saprophytic and necrotrophic stage of infection of the oil palm host. However, there is a lack of information available on the essentiality of CWDEs in wood-decaying process and pathogenesis of this oil palm pathogen especially at molecular and genome levels.

    Methods: In this study, comparative genome analysis was carried out using the G. boninense NJ3 genome to identify and characterize carbohydrate-active enzyme (CAZymes) including CWDE in the fungal genome. Augustus pipeline was employed for gene identification in G. boninense NJ3 and the produced protein sequences were analyzed via dbCAN pipeline and PhiBase 4.5 database annotation for CAZymes and plant-host interaction (PHI) gene analysis, respectively. Comparison of CAZymes from G. boninense NJ3 was made against G. lucidum, a well-studied model Ganoderma sp. and five selected pathogenic fungi for CAZymes characterization. Functional annotation of PHI genes was carried out using Web Gene Ontology Annotation Plot (WEGO) and was used for selecting candidate PHI genes related to cell wall degradation of G. boninense NJ3.

    Results: G. boninense was enriched with CAZymes and CWDEs in a similar fashion to G. lucidum that corroborate with the lignocellulolytic abilities of both closely-related fungal strains. The role of polysaccharide and cell wall degrading enzymes in the hemibiotrophic mode of infection of G. boninense was investigated by analyzing the fungal CAZymes with necrotrophic Armillaria solidipes, A. mellea, biotrophic Ustilago maydis, Melampsora larici-populina and hemibiotrophic Moniliophthora perniciosa. Profiles of the selected pathogenic fungi demonstrated that necrotizing pathogens including G. boninense NJ3 exhibited an extensive set of CAZymes as compared to the more CAZymes-limited biotrophic pathogens. Following PHI analysis, several candidate genes including polygalacturonase, endo β-1,3-xylanase, β-glucanase and laccase were identified as potential CWDEs that contribute to the plant host interaction and pathogenesis.

    Discussion: This study employed bioinformatics tools for providing a greater understanding of the biological mechanisms underlying the production of CAZymes in G. boninense NJ3. Identification and profiling of the fungal polysaccharide- and lignocellulosic-degrading enzymes would further facilitate in elucidating the infection mechanisms through the production of CWDEs by G. boninense. Identification of CAZymes and CWDE-related PHI genes in G. boninense would serve as the basis for functional studies of genes associated with the fungal virulence and pathogenicity using systems biology and genetic engineering approaches.

    Matched MeSH terms: Computational Biology; Systems Biology
  6. Yeo JG, Wasser M, Kumar P, Pan L, Poh SL, Ally F, et al.
    Nat Biotechnol, 2020 06;38(6):679-684.
    PMID: 32440006 DOI: 10.1038/s41587-020-0532-1
    Matched MeSH terms: Computational Biology/methods*
  7. Chan KL, Tatarinova TV, Rosli R, Amiruddin N, Azizi N, Halim MAA, et al.
    Biol. Direct, 2017 Sep 08;12(1):21.
    PMID: 28886750 DOI: 10.1186/s13062-017-0191-4
    BACKGROUND: Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools.

    RESULTS: Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures.

    CONCLUSIONS: We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA biosynthesis and disease resistance. The study demonstrated the advantages of having an integrated approach to gene prediction and developed a computational framework for combining multiple genome annotations. These results, available in the oil palm annotation database ( http://palmxplore.mpob.gov.my ), will provide important resources for studies on the genomes of oil palm and related crops.

    REVIEWERS: This article was reviewed by Alexander Kel, Igor Rogozin, and Vladimir A. Kuznetsov.

    Matched MeSH terms: Computational Biology/methods
  8. Alessandro L, Low KE, Abushelaibi A, Lim SE, Cheng WH, Chang SK, et al.
    Int J Mol Sci, 2022 Nov 18;23(22).
    PMID: 36430761 DOI: 10.3390/ijms232214285
    The diagnosis of endometrial cancer involves sequential, invasive tests to assess the thickness of the endometrium by a transvaginal ultrasound scan. In 6−33% of cases, endometrial biopsy results in inadequate tissue for a conclusive pathological diagnosis and 6% of postmenopausal women with non-diagnostic specimens are later discovered to have severe endometrial lesions. Thus, identifying diagnostic biomarkers could offer a non-invasive diagnosis for community or home-based triage of symptomatic or asymptomatic women. Herein, this study identified high-risk pathogenic nsSNPs in the NRAS gene. The nsSNPs of NRAS were retrieved from the NCBI database. PROVEAN, SIFT, PolyPhen-2, SNPs&GO, PhD-SNP and PANTHER were used to predict the pathogenicity of the nsSNPs. Eleven nsSNPs were identified as “damaging”, and further stability analysis using I-Mutant 2.0 and MutPred 2 indicated eight nsSNPs to cause decreased stability (DDG scores < −0.5). Post-translational modification and protein−protein interactions (PPI) analysis showed putative phosphorylation sites. The PPI network indicated a GFR-MAPK signalling pathway with higher node degrees that were further evaluated for drug targets. The P34L, G12C and Y64D showed significantly lower binding affinity towards GTP than wild-type. Furthermore, the Kaplan−Meier bioinformatics analyses indicated that the NRAS gene deregulation affected the overall survival rate of patients with endometrial cancer, leading to prognostic significance. Findings from this could be considered novel diagnostic and therapeutic markers.
    Matched MeSH terms: Computational Biology/methods
  9. Ealam Selvan M, Lim KS, Teo CH, Lim YY
    J Vis Exp, 2022 Oct 21.
    PMID: 36342167 DOI: 10.3791/64565
    Circular RNAs (circRNAs) are a class of non-coding RNAs that are formed via back-splicing. These circRNAs are predominantly studied for their roles as regulators of various biological processes. Notably, emerging evidence demonstrates that host circRNAs can be differentially expressed (DE) upon infection with pathogens (e.g., influenza and coronaviruses), suggesting a role for circRNAs in regulating host innate immune responses. However, investigations on the role of circRNAs during pathogenic infections are limited by the knowledge and skills required to carry out the necessary bioinformatic analysis to identify DE circRNAs from RNA sequencing (RNA-seq) data. Bioinformatics prediction and identification of circRNAs is crucial before any verification, and functional studies using costly and time-consuming wet-lab techniques. To solve this issue, a step-by-step protocol of in silico prediction and characterization of circRNAs using RNA-seq data is provided in this manuscript. The protocol can be divided into four steps: 1) Prediction and quantification of DE circRNAs via the CIRIquant pipeline; 2) Annotation via circBase and characterization of DE circRNAs; 3) CircRNA-miRNA interaction prediction through Circr pipeline; 4) functional enrichment analysis of circRNA parental genes using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). This pipeline will be useful in driving future in vitro and in vivo research to further unravel the role of circRNAs in host-pathogen interactions.
    Matched MeSH terms: Computational Biology/methods
  10. Høie MH, Kiehl EN, Petersen B, Nielsen M, Winther O, Nielsen H, et al.
    Nucleic Acids Res, 2022 Jul 05;50(W1):W510-W515.
    PMID: 35648435 DOI: 10.1093/nar/gkac439
    Recent advances in machine learning and natural language processing have made it possible to profoundly advance our ability to accurately predict protein structures and their functions. While such improvements are significantly impacting the fields of biology and biotechnology at large, such methods have the downside of high demands in terms of computing power and runtime, hampering their applicability to large datasets. Here, we present NetSurfP-3.0, a tool for predicting solvent accessibility, secondary structure, structural disorder and backbone dihedral angles for each residue of an amino acid sequence. This NetSurfP update exploits recent advances in pre-trained protein language models to drastically improve the runtime of its predecessor by two orders of magnitude, while displaying similar prediction performance. We assessed the accuracy of NetSurfP-3.0 on several independent test datasets and found it to consistently produce state-of-the-art predictions for each of its output features, with a runtime that is up to 600 times faster than the most commonly available methods performing the same tasks. The tool is freely available as a web server with a user-friendly interface to navigate the results, as well as a standalone downloadable package.
    Matched MeSH terms: Computational Biology/methods
  11. Ng CL, Lim TS, Choong YS
    Mol Biotechnol, 2024 Apr;66(4):568-581.
    PMID: 37742298 DOI: 10.1007/s12033-023-00885-x
    Since the advent of hybridoma technology in the year 1975, it took a decade to witness the first approved monoclonal antibody, Orthoclone OKT3 (muromonab-CD3), in the year 1986. Since then, continuous strides have been made to engineer antibodies for specific desired effects. The engineering efforts were not confined to only the variable domains of the antibody but also included the fragment crystallizable (Fc) region that influences the immune response and serum half-life. Engineering of the Fc fragment would have a profound effect on the therapeutic dose, antibody-dependent cell-mediated cytotoxicity as well as antibody-dependent cellular phagocytosis. The integration of computational techniques into antibody engineering designs has allowed for the generation of testable hypotheses and guided the rational antibody design framework prior to further experimental evaluations. In this article, we discuss the recent works in the Fc-fused molecule design that involves computational techniques. We also summarize the usefulness of in silico techniques to aid Fc-fused molecule design and analysis for the therapeutics application.
    Matched MeSH terms: Computational Biology/methods
  12. Zhang H, Mo Y, Wang L, Zhang H, Wu S, Sandai D, et al.
    Front Immunol, 2024;15:1339647.
    PMID: 38660311 DOI: 10.3389/fimmu.2024.1339647
    INTRODUCTION: Over the past decades, immune dysregulation has been consistently demonstrated to be a common characteristic of endometriosis (EM) and inflammatory bowel disease (IBD) in numerous studies. However, the underlying pathological mechanisms remain unknown. In this study, bioinformatics techniques were used to screen large-scale gene expression data for plausible correlations at the molecular level in order to identify common pathogenic pathways between EM and IBD.

    METHODS: Based on the EM transcriptomic datasets GSE7305 and GSE23339, as well as the IBD transcriptomic datasets GSE87466 and GSE126124, differential gene analysis was performed using the limma package in the R environment. Co-expressed differentially expressed genes were identified, and a protein-protein interaction (PPI) network for the differentially expressed genes was constructed using version 11.5 of the STRING database. The MCODE tool in Cytoscape facilitated filtering out protein interaction subnetworks. Key genes in the PPI network were identified through two topological analysis algorithms (MCC and Degree) from the CytoHubba plugin. Upset was used for visualization of these key genes. The diagnostic value of gene expression levels for these key genes was assessed using the Receiver Operating Characteristic (ROC) curve and Area Under the Curve (AUC). The CIBERSORT algorithm determined the infiltration status of 22 immune cell subtypes, exploring differences between EM and IBD patients in both control and disease groups. Finally, different gene expression trends shared by EM and IBD were input into CMap to identify small molecule compounds with potential therapeutic effects.

    RESULTS: 113 differentially expressed genes (DEGs) that were co-expressed in EM and IBD have been identified, comprising 28 down-regulated genes and 86 up-regulated genes. Functional enrichment analyses of the co-expressed DEGs of EM and IBD focused on immune response activation, circulating immunoglobulin-mediated humoral immune response and humoral immune response. Five hub genes (SERPING1, VCAM1, CLU, C3, CD55) were identified through the protein-protein interaction network and MCODE. High Area Under the Curve (AUC) values of Receiver Operating Characteristic (ROC) curves for the 5 hub genes indicate their predictive ability for disease occurrence. These hub genes could be used as potential biomarkers for the development of EM and IBD. Furthermore, the CMap database identified a total of 9 small molecule compounds (TTNPB, CAY-10577, PD-0325901, etc.) targeting therapeutic genes for EM and IBD.

    DISCUSSION: Our research revealed common pathogenic mechanisms between EM and IBD, particularly emphasizing immune regulation and cell signalling, indicating the significance of immune factors in the occurrence and progression of both diseases. By elucidating shared mechanisms, our study provides novel avenues for the prevention and treatment of EM and IBD.

    Matched MeSH terms: Computational Biology/methods
  13. Axtner J, Crampton-Platt A, Hörig LA, Mohamed A, Xu CCY, Yu DW, et al.
    Gigascience, 2019 Apr 01;8(4).
    PMID: 30997489 DOI: 10.1093/gigascience/giz029
    BACKGROUND: The use of environmental DNA for species detection via metabarcoding is growing rapidly. We present a co-designed lab workflow and bioinformatic pipeline to mitigate the 2 most important risks of environmental DNA use: sample contamination and taxonomic misassignment. These risks arise from the need for polymerase chain reaction (PCR) amplification to detect the trace amounts of DNA combined with the necessity of using short target regions due to DNA degradation.

    FINDINGS: Our high-throughput workflow minimizes these risks via a 4-step strategy: (i) technical replication with 2 PCR replicates and 2 extraction replicates; (ii) using multi-markers (12S,16S,CytB); (iii) a "twin-tagging," 2-step PCR protocol; and (iv) use of the probabilistic taxonomic assignment method PROTAX, which can account for incomplete reference databases. Because annotation errors in the reference sequences can result in taxonomic misassignment, we supply a protocol for curating sequence datasets. For some taxonomic groups and some markers, curation resulted in >50% of sequences being deleted from public reference databases, owing to (i) limited overlap between our target amplicon and reference sequences, (ii) mislabelling of reference sequences, and (iii) redundancy. Finally, we provide a bioinformatic pipeline to process amplicons and conduct PROTAX assignment and tested it on an invertebrate-derived DNA dataset from 1,532 leeches from Sabah, Malaysia. Twin-tagging allowed us to detect and exclude sequences with non-matching tags. The smallest DNA fragment (16S) amplified most frequently for all samples but was less powerful for discriminating at species rank. Using a stringent and lax acceptance criterion we found 162 (stringent) and 190 (lax) vertebrate detections of 95 (stringent) and 109 (lax) leech samples.

    CONCLUSIONS: Our metabarcoding workflow should help research groups increase the robustness of their results and therefore facilitate wider use of environmental and invertebrate-derived DNA, which is turning into a valuable source of ecological and conservation information on tetrapods.

    Matched MeSH terms: Computational Biology/methods
  14. Naseer S, Ali RF, Fati SM, Muneer A
    Sci Rep, 2022 01 07;12(1):128.
    PMID: 34996975 DOI: 10.1038/s41598-021-03895-4
    In biological systems, glutamic acid is a crucial amino acid which is used in protein biosynthesis. Carboxylation of glutamic acid is a significant post-translational modification which plays an important role in blood coagulation by activating prothrombin to thrombin. Contrariwise, 4-carboxyglutamate is also found to be involved in diseases including plaque atherosclerosis, osteoporosis, mineralized heart valves and bone resorption, and serves as a biomarker for the onset of these diseases. Owing to the pathophysiological significance of 4-carboxyglutamate, its identification is important to better understand pathophysiological systems. The wet lab identification of prospective 4-carboxyglutamate sites is costly, laborious and time consuming due to inherent difficulties of in vivo, ex vivo and in vitro experiments. To supplement these experiments, we proposed, implemented, and evaluated a different approach to develop 4-carboxyglutamate site predictors using pseudo amino acid compositions (PseAAC) and deep neural networks (DNNs). Our approach does not require any feature extraction and employs deep neural networks to learn feature representations of peptide sequences and perform classification thereof. The proposed approach is validated using standard performance evaluation metrics. Among different deep neural networks, the convolutional neural network-based predictor achieved the best scores on an independent dataset, with an accuracy of 94.7%, AUC score of 0.91 and F1-score of 0.874, which shows the promise of the proposed approach. The iCarboxE-Deep server is deployed at https://share.streamlit.io/sheraz-n/carboxyglutamate/app.py .
    Matched MeSH terms: Computational Biology*
  15. Fatumo S, Ebenezer TE, Ekenna C, Isewon I, Ahmad U, Adetunji C, et al.
    PMID: 32742665 DOI: 10.1017/gheg.2020.3
    Africa is of central importance to human origins, disease susceptibility, agriculture and biodiversity conservation. Nigeria, as the most populous and most diverse country in Africa, owing to its 250 ethnic groups and over 500 different native languages, is essential to any global genomic initiative. The newly inaugurated Nigerian Bioinformatics and Genomics Network (NBGN) is needed to facilitate collaborative research activities and foster opportunities for skills development amongst Nigerian bioinformatics and genomics investigators. NBGN aims to advance and sustain the fields of genomics and bioinformatics in Nigeria by serving as a vehicle to foster collaboration and by providing new opportunities for interaction between the various interdisciplinary subfields of genomics, computational biology and bioinformatics, which will in turn provide opportunities for early career researchers. To provide the foundation for sustainable collaborations, the network organises conferences, workshops and training, creates opportunities for collaborative research studies and internships, recognises excellence, openly shares information and creates opportunities for more Nigerians to develop the skills necessary to excel in genomics and bioinformatics. NBGN has so far attracted more than 650 members around the world. Research collaborations between Nigeria, Africa and the West will grow and all stakeholders, including funding partners, African scientists, researchers across the globe, physicians and patients will be the eventual winners. The exponential membership growth and diversity of research interests of NBGN just within weeks of its establishment, and the unanticipated attendance at its activities, suggest the importance of the network to bioinformatics and genomics research in Nigeria.
    Matched MeSH terms: Computational Biology*
  16. Khor BY, Tye GJ, Lim TS, Choong YS
    PMID: 26338054 DOI: 10.1186/s12976-015-0014-1
    Protein structure prediction from amino acid sequence has been one of the most challenging aspects of computational structural biology, despite significant progress in recent years shown by the critical assessment of protein structure prediction (CASP) experiments. When experimentally determined structures are unavailable, predicted structures may serve as starting points to study a protein. If the target protein contains a homologous region, a high-resolution (typically <1.5 Å) model can be built via comparative modelling. However, when confronted with low sequence similarity of the target protein (also known as a twilight-zone protein, where sequence identity with available templates is less than 30%), the protein structure prediction has to be initiated from scratch. Traditionally, twilight-zone proteins can be predicted via threading or ab initio methods. Based on the current trend, combining different methods improves success in the prediction of twilight-zone proteins. In this mini review, the methods, progress and challenges in the prediction of twilight-zone proteins are discussed.
    Matched MeSH terms: Computational Biology/methods*
  17. Ahmad RM, Ali BR, Al-Jasmi F, Al Dhaheri N, Al Turki S, Kizhakkedath P, et al.
    Hum Genomics, 2024 Sep 11;18(1):99.
    PMID: 39256852 DOI: 10.1186/s40246-024-00667-9
    Single nucleotide variants (SNVs) can exert substantial and extremely variable impacts on various cellular functions, making accurate predictions of their consequences challenging, albeit crucial especially in clinical settings such as in oncology. Laboratory-based experimental methods for assessing these effects are time-consuming and often impractical, highlighting the importance of in-silico tools for variant impact prediction. However, the performance metrics of currently available tools on breast cancer missense variants from benchmarking databases have not been thoroughly investigated, creating a knowledge gap in the accurate prediction of pathogenicity. In this study, the benchmarking datasets ClinVar and HGMD were used to evaluate 21 Artificial Intelligence (AI)-derived in-silico tools. Missense variants in breast cancer genes were extracted from ClinVar and HGMD professional v2023.1. The HGMD dataset focused on pathogenic variants only; to ensure balance, benign variants for the same genes were included from the ClinVar database. Interestingly, our analysis of both datasets revealed variants across genes with varying penetrance levels like low and moderate in addition to high, reinforcing the value of disease-specific tools. The top-performing tools identified on the ClinVar dataset were MutPred (Accuracy = 0.73), Meta-RNN (Accuracy = 0.72), ClinPred (Accuracy = 0.71), Meta-SVM, REVEL, and Fathmm-XF (Accuracy = 0.70). On the HGMD dataset, they were ClinPred (Accuracy = 0.72), MetaRNN (Accuracy = 0.71), CADD (Accuracy = 0.69), Fathmm-MKL (Accuracy = 0.68), and Fathmm-XF (Accuracy = 0.67). These findings offer clinicians and researchers valuable insights for selecting, improving, and developing effective in-silico tools for breast cancer pathogenicity prediction. Bridging this knowledge gap contributes to advancing precision medicine and enhancing diagnostic and therapeutic approaches for breast cancer patients with potential implications for other conditions.
    Matched MeSH terms: Computational Biology/methods
  18. Tharanga S, Ünlü ES, Hu Y, Sjaugi MF, Çelik MA, Hekimoğlu H, et al.
    Brief Bioinform, 2024 Nov 22;26(1).
    PMID: 39592151 DOI: 10.1093/bib/bbae607
    Sequence diversity is one of the major challenges in the design of diagnostic, prophylactic, and therapeutic interventions against viruses. DiMA is a novel tool that is big data-ready and designed to facilitate the dissection of sequence diversity dynamics for viruses. DiMA stands out from other diversity analysis tools by offering various unique features. DiMA provides a quantitative overview of sequence (DNA/RNA/protein) diversity by use of Shannon's entropy corrected for size bias, applied via a user-defined k-mer sliding window to an input alignment file, and each k-mer position is dissected to various diversity motifs. The motifs are defined based on the probability of distinct sequences at a given k-mer alignment position, whereby an index is the predominant sequence, while all the others are (total) variants to the index. The total variants are sub-classified into the major (most common) variant, minor variants (occurring more than once and of incidence lower than the major), and the unique (singleton) variants. DiMA allows user-defined, sequence metadata enrichment for analyses of the motifs. The application of DiMA was demonstrated for the alignment data of the relatively conserved Spike protein (2,106,985 sequences) of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the relatively highly diverse pol gene (2637) of the human immunodeficiency virus-1 (HIV-1). The tool is publicly available as a web server (https://dima.bezmialem.edu.tr), as a Python library (via PyPi) and as a command line client (via GitHub).
    Matched MeSH terms: Computational Biology/methods
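
The entropy and motif classification described in the DiMA abstract above can be illustrated with a minimal Python sketch. It is not DiMA's code or API: the function name `kmer_motifs`, the toy alignment, and the handling of ties and singletons are assumptions made for illustration, and the entropy is plain Shannon entropy without the size-bias correction the abstract mentions.

```python
from collections import Counter
from math import log2


def kmer_motifs(alignment, k):
    """Yield (position, entropy, motifs) for each k-mer window of an alignment.

    Hypothetical helper, not part of DiMA. `alignment` is a list of
    equal-length aligned sequences; positions are 1-based.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    for start in range(length - k + 1):
        counts = Counter(seq[start:start + k] for seq in alignment)
        total = sum(counts.values())
        # Plain Shannon entropy in bits (assumption: no size-bias correction).
        entropy = sum(-(n / total) * log2(n / total) for n in counts.values())
        ranked = counts.most_common()  # most frequent k-mer first
        motifs = {"index": ranked[0][0], "major": None, "minor": [], "unique": []}
        for kmer, n in ranked[1:]:
            if n == 1:
                motifs["unique"].append(kmer)   # singleton variants
            elif motifs["major"] is None:
                motifs["major"] = kmer          # most common variant
            else:
                motifs["minor"].append(kmer)    # remaining repeated variants
        yield start + 1, entropy, motifs


# Toy alignment (made up for illustration): 10 sequences of length 4.
alignment = ["ACGT"] * 4 + ["ACGA"] * 3 + ["ACGC"] * 2 + ["ACGG"]
for position, entropy, motifs in kmer_motifs(alignment, k=2):
    print(position, round(entropy, 3), motifs)
```

In this sketch a variant tied with the major variant is listed as minor, and a singleton is always classed as unique even when no variant occurs more often; the real tool may resolve these cases differently.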
  19. Tan YC, Lahiri C
    Front Immunol, 2022;13:900509.
    PMID: 35720310 DOI: 10.3389/fimmu.2022.900509
    In parallel to the uncontrolled use of antibiotics, the emergence of multidrug-resistant bacteria, like Acinetobacter baumannii, has posed a severe threat. A. baumannii predominates in the nosocomial setting due to its ability to persist in hospitals and survive antibiotic treatment, leading to an increasing prevalence of, and mortality from, its infections. With the widening spectrum of drug resistance and the continual failure of newly discovered antibiotics, new therapeutic countermeasures have been in high demand. Hence, recent research has favoured the long-term solution of designing vaccines. As a realistic alternative strategy to combat this pathogen, anti-A. baumannii vaccine research has continued to unearth various antigens, with variable results, over the last decade. Other approaches, including pan-genomics, subtractive proteomics, and reverse vaccination strategies, have shown promise for identifying promiscuous core vaccine candidates that resulted in chimeric vaccine constructs. In addition, the integration of basic knowledge of the pathobiology of this drug-resistant bacterium has facilitated the development of effective multiantigen vaccines. As opposed to the conventional trial-and-error approach, incorporating in silico methods in recent studies, particularly network analysis, has shown great promise in unearthing novel vaccine candidates from the A. baumannii proteome. Some studies have used multiple A. baumannii data sources to build co-functional networks and analyze them by k-shell decomposition. Additionally, Whole Genomic Protein Interactome (GPIN) analysis has offered a rational approach for identifying essential proteins and presenting them as vaccine candidates effective enough to combat the deadly pathogenic threats posed by A. baumannii. Others have identified multiple immune nodes using network-based centrality measurements, yielding synergistic antigen combinations for different vaccination strategies. Protein-protein interactions have also been inferred using structural approaches, such as molecular docking and molecular dynamics simulation. Similar workflows and technologies were employed to unveil novel A. baumannii drug targets, with a similar trend of increasing use of in silico techniques. This review integrates the latest knowledge on the development of A. baumannii vaccines while highlighting in silico methods as the future of such exploratory research. In parallel, we also briefly summarize recent advancements in A. baumannii drug target research.
    Matched MeSH terms: Computational Biology/methods
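
The k-shell decomposition mentioned in the abstract above can be sketched generically in Python. This is a textbook-style illustration, not the pipeline used in the reviewed studies: the function name `k_shell`, the adjacency-dictionary input format, and the toy graph (with made-up node labels) are all assumptions.

```python
def k_shell(adjacency):
    """Return {node: shell index} for an undirected graph.

    Hypothetical helper. `adjacency` maps each node to the set of its
    neighbours, and every edge must be listed in both directions.
    """
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    shell = {}
    k = 0
    while remaining:
        # Nodes whose current degree is at most k belong to shell k.
        peel = [node for node in remaining if degree[node] <= k]
        if not peel:
            k += 1
            continue
        while peel:
            node = peel.pop()
            if node not in remaining:
                continue  # already removed via a duplicate queue entry
            remaining.discard(node)
            shell[node] = k
            # Removing a node lowers its neighbours' degrees, which may
            # pull them into the same shell.
            for nbr in adjacency[node]:
                if nbr in remaining:
                    degree[nbr] -= 1
                    if degree[nbr] <= k:
                        peel.append(nbr)
    return shell


# Toy network (made up): a triangle P1-P2-P3 with one pendant node P4.
network = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2"},
    "P4": {"P1"},
}
print(k_shell(network))
```

Nodes in the highest shell form the densely connected core of the network, which is why shell indices are commonly used to rank candidate hub proteins.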
  20. Shahab M, Iqbal MW, Ahmad A, Alshabrmi FM, Wei DQ, Khan A, et al.
    Comput Biol Med, 2024 Mar;170:108056.
    PMID: 38301512 DOI: 10.1016/j.compbiomed.2024.108056
    The Nipah virus (NPV) is a highly lethal virus, known for its significant fatality rate. The virus first emerged in Malaysia in 1998 and later led to outbreaks in nearby countries such as Bangladesh, Singapore, and India. Currently, there are no specific vaccines available for this virus. The current work employed the reverse vaccinology method to conduct a comprehensive analysis of the entire NPV proteome. The aim was to identify and select the most promising antigenic proteins that could serve as potential candidates for vaccine development. We also designed a B- and T-cell epitope-based vaccine candidate using an immunoinformatics approach. We identified a total of 5 Cytotoxic T Lymphocyte (CTL), 5 Helper T Lymphocyte (HTL), and 6 linear B-cell potential antigenic epitopes that are novel and can be used for further vaccine development against Nipah virus. We then predicted the physicochemical properties, antigenicity, immunogenicity, and allergenicity of the designed vaccine candidate against NPV. Further, computational analysis indicated that these epitopes possessed highly antigenic properties and were capable of interacting with immune receptors. The designed vaccine was then docked with the human immune receptors TLR-2 and TLR-4 and showed robust interaction with both. Molecular dynamics simulations demonstrated robust binding and good dynamics. Computational immune response modeling showed that, after multiple doses at varied intervals, the immunogenic construct might elicit a significant immune response. In conclusion, the immunogenic construct shows promise in providing protection against NPV; however, further experimental validation is required before moving to clinical trials.
    Matched MeSH terms: Computational Biology/methods