MyMedR

From discovery to spread: The evolution and phylogeny of Getah virus

Li YY, Liu H, Fu SH, Li XL, Guo XF, Li MH, et al.

Infect Genet Evol, 2017 11;55:48-55.
PMID: 28827175 DOI: 10.1016/j.meegid.2017.08.016

Getah virus (GETV) was first isolated in Malaysia in 1955. Since then, epidemics in horses and pigs caused by GETV have resulted in huge economic losses. At present, GETV has spread across Eurasia and Southeast Asia, including mainland China, Korea, Japan, Mongolia, and Russia. Data show that the Most Recent Common Ancestor (MRCA) of GETV existed about 145 years ago (95% HPD: 75-244) and gradually evolved into four distinct evolutionary populations: Groups I-IV. The MRCA of GETVs in Group III, which includes all GETVs isolated from mosquitoes, pigs, horses, and other animals since the 1960s (from latitude 19°N to 60°N), existed about 51 years ago (95% HPD: 51-72). Group III is responsible for most viral epidemics among domestic animals. An analysis of the GETV E2 protein sequence and structure revealed seven common amino acid mutation sites. These sites are responsible for the structural and electrostatic differences detected between widespread Group III isolates and the prototype strain MM2021. These differences may account for the recent geographical radiation of the virus. Considering the economic significance of GETV infection in pigs and horses, we recommend the implementation of strict viral screening and monitoring programs.

Matched MeSH terms: Computational Biology/methods
First comprehensive in silico analysis of the functional and structural consequences of SNPs in human GalNAc-T1 gene

Mohamoud HS, Hussain MR, El-Harouni AA, Shaik NA, Qasmi ZU, Merican AF, et al.

Comput Math Methods Med, 2014;2014:904052.
PMID: 24723968 DOI: 10.1155/2014/904052

GalNAc-T1, a key candidate of GalNac-transferases genes family that is involved in mucin-type O-linked glycosylation pathway, is expressed in most biological tissues and cell types. Despite the reported association of GalNAc-T1 gene mutations with human disease susceptibility, the comprehensive computational analysis of coding, noncoding and regulatory SNPs, and their functional impacts on protein level, still remains unknown. Therefore, sequence- and structure-based computational tools were employed to screen the entire listed coding SNPs of GalNAc-T1 gene in order to identify and characterize them. Our concordant in silico analysis by SIFT, PolyPhen-2, PANTHER-cSNP, and SNPeffect tools, identified the potential nsSNPs (S143P, G258V, and Y414D variants) from 18 nsSNPs of GalNAc-T1. Additionally, 2 regulatory SNPs (rs72964406 and rs34304568) were also identified in GalNAc-T1 by using FastSNP tool. Using multiple computational approaches, we have systematically classified the functional mutations in regulatory and coding regions that can modify expression and function of GalNAc-T1 enzyme. These genetic variants can further assist in better understanding the wide range of disease susceptibility associated with the mucin-based cell signalling and pathogenic binding, and may help to develop novel therapeutic elements for associated diseases.

Matched MeSH terms: Computational Biology/methods
Fine-Mapping of the 1p11.2 Breast Cancer Susceptibility Locus

Horne HN, Chung CC, Zhang H, Yu K, Prokunina-Olsson L, Michailidou K, et al.

PLoS One, 2016;11(8):e0160316.
PMID: 27556229 DOI: 10.1371/journal.pone.0160316

The Cancer Genetic Markers of Susceptibility genome-wide association study (GWAS) originally identified a single nucleotide polymorphism (SNP) rs11249433 at 1p11.2 associated with breast cancer risk. To fine-map this locus, we genotyped 92 SNPs in a 900 kb region (120,505,799-121,481,132) flanking rs11249433 in 45,276 breast cancer cases and 48,998 controls of European, Asian and African ancestry from 50 studies in the Breast Cancer Association Consortium. Genotyping was done using iCOGS, a custom-built array. Due to the complicated nature of the region on chr1p11.2: 120,300,000-120,505,798, that lies near the centromere and contains seven duplicated genomic segments, we restricted analyses to 429 SNPs excluding the duplicated regions (42 genotyped and 387 imputed). Per-allelic associations with breast cancer risk were estimated using logistic regression models adjusting for study and ancestry-specific principal components. The strongest association observed was with the original identified index SNP rs11249433 (minor allele frequency (MAF) 0.402; per-allele odds ratio (OR) = 1.10, 95% confidence interval (CI) 1.08-1.13, P = 1.49 × 10^-21). The association for rs11249433 was limited to ER-positive breast cancers (test for heterogeneity P ≤ 8.41 × 10^-5). Additional analyses by other tumor characteristics showed stronger associations with moderately/well differentiated tumors and tumors of lobular histology. Although no significant eQTL associations were observed, in silico analyses showed that rs11249433 was located in a region that is likely a weak enhancer/promoter. Fine-mapping analysis of the 1p11.2 breast cancer susceptibility locus confirms this region to be limited to risk to cancers that are ER-positive.

Matched MeSH terms: Computational Biology/methods
Extracellular Vesicle-derived circular RNAs confers chemoresistance in Colorectal cancer

Hon KW, Ab-Mutalib NS, Abdullah NMA, Jamal R, Abu N

Sci Rep, 2019 Nov 11;9(1):16497.
PMID: 31712601 DOI: 10.1038/s41598-019-53063-y

Chemo-resistance is associated with poor prognosis in colorectal cancer (CRC), with the absence of early biomarker. Exosomes are microvesicles released by body cells for intercellular communication. Circular RNAs (circRNAs) are non-coding RNAs with covalently closed loops and enriched in exosomes. Crosstalk between circRNAs in exosomes and chemo-resistance in CRC remains unknown. This research aims to identify exosomal circRNAs associated with FOLFOX-resistance in CRC. FOLFOX-resistant HCT116 CRC cells (HCT116-R) were generated from parental HCT116 cells (HCT116-P) using periodic drug induction. Exosomes were characterized using transmission electron microscopy (TEM), Zetasizer and Western blot. Our exosomes were translucent cup-shaped structures under TEM with differential expression of TSG101, CD9, and CD63. We performed circRNAs microarray using exosomal RNAs from HCT116-R and HCT116-P cells. We validated our microarray data using serum samples. We performed drug sensitivity assay and cell cycle analysis to characterize selected circRNA after siRNA-knockdown. Using fold change >2 and p

Matched MeSH terms: Computational Biology/methods
Evolutionary and genomic analysis of the caleosin/peroxygenase (CLO/PXG) gene/protein families in the Viridiplantae

Rahman F, Hassan M, Rosli R, Almousally I, Hanano A, Murphy DJ

PLoS One, 2018;13(5):e0196669.
PMID: 29771926 DOI: 10.1371/journal.pone.0196669

Bioinformatics analyses of caleosin/peroxygenases (CLO/PXG) demonstrated that these genes are present in the vast majority of Viridiplantae taxa for which sequence data are available. Functionally active CLO/PXG proteins with roles in abiotic stress tolerance and lipid droplet storage are present in some Trebouxiophycean and Chlorophycean green algae but are absent from the small number of sequenced Prasinophyceaen genomes. CLO/PXG-like genes are expressed during dehydration stress in Charophyte algae, a sister clade of the land plants (Embryophyta). CLO/PXG-like sequences are also present in all of the >300 sequenced Embryophyte genomes, where some species contain as many as 10-12 genes that have arisen via selective gene duplication. Angiosperm genomes harbour at least one copy each of two distinct CLO/PXG isoforms, termed H (high) and L (low), where H-forms contain an additional C-terminal motif of about 30-50 residues that is absent from L-forms. In contrast, species in other Viridiplantae taxa, including green algae, non-vascular plants, ferns and gymnosperms, contain only one (or occasionally both) of these isoforms per genome. Transcriptome and biochemical data show that CLO/PXG-like genes have complex patterns of developmental and tissue-specific expression. CLO/PXG proteins can associate with cytosolic lipid droplets and/or bilayer membranes. Many of the analysed isoforms also have peroxygenase activity and are involved in oxylipin metabolism. The distribution of CLO/PXG-like genes is consistent with an origin >1 billion years ago in at least two of the earliest diverging groups of the Viridiplantae, namely the Chlorophyta and the Streptophyta, after the Viridiplantae had already diverged from other Archaeplastidal groups such as the Rhodophyta and Glaucophyta. While algal CLO/PXGs have roles in lipid packaging and stress responses, the Embryophyte proteins have a much wider spectrum of roles and may have been instrumental in the colonisation of terrestrial habitats and the subsequent diversification as the major land flora.

Matched MeSH terms: Computational Biology/methods
Evidence-based gene models for structural and functional annotations of the oil palm genome

Chan KL, Tatarinova TV, Rosli R, Amiruddin N, Azizi N, Halim MAA, et al.

Biol. Direct, 2017 Sep 08;12(1):21.
PMID: 28886750 DOI: 10.1186/s13062-017-0191-4

BACKGROUND: Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools.
RESULTS: Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures.
CONCLUSIONS: We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA biosynthesis and disease resistance. The study demonstrated the advantages of having an integrated approach to gene prediction and developed a computational framework for combining multiple genome annotations. These results, available in the oil palm annotation database ( http://palmxplore.mpob.gov.my ), will provide important resources for studies on the genomes of oil palm and related crops.
REVIEWERS: This article was reviewed by Alexander Kel, Igor Rogozin, and Vladimir A. Kuznetsov.
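
The GC3 measure defined in the RESULTS above (fraction of cytosine and guanine in the third codon position) can be illustrated with a short sketch. This is not the authors' pipeline; the function names are hypothetical, the input is assumed to be an in-frame coding sequence, and any stop codon present is counted like any other codon. Only the 0.75286 cutoff comes from the abstract.

```python
# Illustrative sketch only: function names are hypothetical, not from the paper.
GC3_RICH_THRESHOLD = 0.75286  # cutoff for "GC3-rich" quoted in the abstract


def gc3(cds: str) -> float:
    """Return the fraction of G or C bases at the third position of each codon.

    Assumes `cds` is an in-frame coding sequence; trailing bases that do not
    form a complete codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # drop an incomplete trailing codon
    third_positions = seq[2:usable:3]     # bases at index 2, 5, 8, ...
    if not third_positions:
        raise ValueError("sequence contains no complete codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


def is_gc3_rich(cds: str) -> bool:
    """Apply the GC3 >= 0.75286 cutoff from the abstract."""
    return gc3(cds) >= GC3_RICH_THRESHOLD
```

For example, `gc3("ATGGCCAAGTAA")` looks at the third bases G, C, G, and A and gives 0.75, which falls just under the cutoff.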

Matched MeSH terms: Computational Biology/methods
Evaluation of reference genes at different developmental stages for quantitative real-time PCR in Aedes aegypti

Dzaki N, Ramli KN, Azlan A, Ishak IH, Azzam G

Sci Rep, 2017 03 16;7:43618.
PMID: 28300076 DOI: 10.1038/srep43618

The mosquito Aedes aegypti (Ae. aegypti) is the most notorious vector of illness-causing viruses such as Dengue, Chikungunya, and Zika. Although numerous genetic expression studies utilizing quantitative real-time PCR (qPCR) have been conducted with regards to Ae. aegypti, a panel of genes to be used suitably as references for the purpose of expression-level normalization within this epidemiologically important insect is presently lacking. Here, the usability of seven widely-utilized reference genes i.e. actin (ACT), eukaryotic elongation factor 1 alpha (eEF1α), alpha tubulin (α-tubulin), ribosomal proteins L8, L32 and S17 (RPL8, RPL32 and RPS17), and glyceraldehyde 3-phosphate dehydrogenase (GAPDH) were investigated. Expression patterns of the reference genes were observed in sixteen pre-determined developmental stages and in cell culture. Gene stability was inferred from qPCR data through three freely available algorithms i.e. BestKeeper, geNorm, and NormFinder. The consensus rankings generated from stability values provided by these programs suggest a combination of at least two genes for normalization. ACT and RPS17 are the most dependably expressed reference genes and therefore, we propose an ACT/RPS17 combination for normalization in all Ae. aegypti derived samples. GAPDH performed least desirably, and is thus not a recommended reference gene. This study emphasizes the importance of validating reference genes in Ae. aegypti for qPCR based research.
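
As a rough illustration of what normalizing against a two-gene combination such as ACT/RPS17 involves, the sketch below scales a target gene's signal by the geometric mean of the reference genes' signals. It is an assumption-laden sketch, not the procedure used in the study: the function name is hypothetical, the Cq values in the usage note are invented, and every gene is assumed to share the same amplification efficiency (2.0, a perfect doubling per cycle). Under that assumption, dividing by the geometric mean of the reference quantities is the same as subtracting the arithmetic mean of the reference Cq values.

```python
from statistics import mean


def normalized_expression(target_cq, reference_cqs, efficiency=2.0):
    """Relative expression of a target gene against several reference genes.

    Hypothetical helper for illustration. Assumes one shared amplification
    efficiency, so the geometric mean of the reference quantities
    (efficiency ** -Cq) equals efficiency ** -(mean of the reference Cqs).
    """
    if not reference_cqs:
        raise ValueError("at least one reference gene Cq is required")
    reference_cq = mean(reference_cqs)  # arithmetic mean of Cq values
    return efficiency ** (reference_cq - target_cq)
```

With invented values, `normalized_expression(20.0, [20.0, 24.0])` gives 4.0: the target crosses the threshold two cycles before the averaged reference signal.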

Matched MeSH terms: Computational Biology/methods
Emerging strengths in Asia Pacific bioinformatics

Ranganathan S, Hsu WL, Yang UC, Tan TW

BMC Bioinformatics, 2008;9 Suppl 12:S1.
PMID: 19091008 DOI: 10.1186/1471-2105-9-S12-S1

The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.

Matched MeSH terms: Computational Biology/methods*
Efficient feature selection and classification of protein sequence data in bioinformatics

Iqbal MJ, Faye I, Samir BB, Said AM

ScientificWorldJournal, 2014;2014:173869.
PMID: 25045727 DOI: 10.1155/2014/173869

Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics were to store and manage the biological data, and develop and analyze computational tools to enhance their understanding. The size of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for the experimental methods. To reduce the gap between newly sequenced protein and proteins with known functions, many computational techniques involving classification and clustering algorithms were proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of large amount of newly discovered proteins. The existing classification results are unsatisfactory due to a huge size of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique has been proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth.

Matched MeSH terms: Computational Biology/methods*
Drug Discovery of Spinal Muscular Atrophy (SMA) from the Computational Perspective: A Comprehensive Review

Chong LC, Gandhi G, Lee JM, Yeo WWY, Choi SB

Int J Mol Sci, 2021 Aug 20;22(16).
PMID: 34445667 DOI: 10.3390/ijms22168962

Spinal muscular atrophy (SMA), one of the leading inherited causes of child mortality, is a rare neuromuscular disease arising from loss-of-function mutations of the survival motor neuron 1 (SMN1) gene, which encodes the SMN protein. When lacking the SMN protein in neurons, patients suffer from muscle weakness and atrophy, and in the severe cases, respiratory failure and death. Several therapeutic approaches show promise with human testing and three medications have been approved by the U.S. Food and Drug Administration (FDA) to date. Despite the shown promise of these approved therapies, there are some crucial limitations, one of the most important being the cost. The FDA-approved drugs are high-priced and are shortlisted among the most expensive treatments in the world. The price is still far beyond affordable and may serve as a burden for patients. The blooming of the biomedical data and advancement of computational approaches have opened new possibilities for SMA therapeutic development. This article highlights the present status of computationally aided approaches, including in silico drug repurposing, network driven drug discovery as well as artificial intelligence (AI)-assisted drug discovery, and discusses the future prospects.

Matched MeSH terms: Computational Biology/methods
Dissecting Candida albicans Infection from the Perspective of C. albicans Virulence and Omics Approaches on Host-Pathogen Interaction: A Review

Chin VK, Lee TY, Rusliza B, Chong PP

Int J Mol Sci, 2016 Oct 18;17(10).
PMID: 27763544

Candida bloodstream infections remain the most frequent life-threatening fungal disease, with Candida albicans accounting for 70% to 80% of the Candida isolates recovered from infected patients. In nature, Candida species are part of the normal commensal flora in mammalian hosts. However, they can transform into pathogens once the host immune system is weakened or breached. More recently, mortality attributed to Candida infections has continued to increase due to both inherent and acquired drug resistance in Candida, the inefficacy of the available antifungal drugs, tedious diagnostic procedures, and a rising number of immunocompromised patients. Adoption of animal models, viz. minihosts, mice, and zebrafish, has brought us closer to unraveling the pathogenesis and complexity of Candida infection in human hosts, leading towards the discovery of biomarkers and identification of potential therapeutic agents. In addition, the advancement of omics technologies offers a holistic view of the Candida-host interaction in a non-targeted and non-biased manner. Hence, in this review, we seek to summarize past and present milestone findings on C. albicans virulence, adoption of animal models in the study of C. albicans infection, and the application of omics technologies in the study of Candida-host interaction. A profound understanding of the interaction between host defense and pathogenesis is imperative for better design of novel immunotherapeutic strategies in future.

Matched MeSH terms: Computational Biology/methods*
Discovery of a new class of inhibitors for the protein arginine deiminase type 4 (PAD4) by structure-based virtual screening

Teo CY, Shave S, Chor AL, Salleh AB, Rahman MB, Walkinshaw MD, et al.

BMC Bioinformatics, 2012;13 Suppl 17:S4.
PMID: 23282142 DOI: 10.1186/1471-2105-13-S17-S4

BACKGROUND: Rheumatoid arthritis (RA) is an autoimmune disease with unknown etiology. Anticitrullinated protein autoantibody has been documented as a highly specific autoantibody associated with RA. Protein arginine deiminase type 4 (PAD4) is the enzyme responsible for catalyzing the conversion of peptidylarginine into peptidylcitrulline. PAD4 is a new therapeutic target for RA treatment. In order to search for inhibitors of PAD4, structure-based virtual screening was performed using LIDAEUS (Ligand discovery at Edinburgh university). Potential inhibitors were screened experimentally by inhibition assays.
RESULTS: Twenty two of the top-ranked water-soluble compounds were selected for inhibitory screening against PAD4. Three compounds showed significant inhibition of PAD4 and their IC50 values were investigated. The structures of the three compounds show no resemblance with previously discovered PAD4 inhibitors, nor with existing drugs for RA treatment.
CONCLUSION: Three compounds were discovered as potential inhibitors of PAD4 by virtual screening. The compounds are commercially available and can be used as scaffolds to design more potent inhibitors against PAD4.

Matched MeSH terms: Computational Biology/methods*
Differential Bees Flux Balance Analysis with OptKnock for in silico microbial strains optimization

Choon YW, Mohamad MS, Deris S, Illias RM, Chong CK, Chai LE, et al.

PLoS One, 2014;9(7):e102744.
PMID: 25047076 DOI: 10.1371/journal.pone.0102744

Microbial strains optimization for the overproduction of desired phenotype has been a popular topic in recent years. The strains can be optimized through several techniques in the field of genetic engineering. Gene knockout is a genetic engineering technique that can engineer the metabolism of microbial cells with the objective to obtain desirable phenotypes. However, the complexities of the metabolic networks have made the process to identify the effects of genetic modification on the desirable phenotypes challenging. Furthermore, a vast number of reactions in cellular metabolism often lead to the combinatorial problem in obtaining optimal gene deletion strategy. Basically, the size of a genome-scale metabolic model is usually large. As the size of the problem increases, the computation time increases exponentially. In this paper, we propose Differential Bees Flux Balance Analysis (DBFBA) with OptKnock to identify optimal gene knockout strategies for maximizing the production yield of desired phenotypes while sustaining the growth rate. This proposed method functions by improving the performance of a hybrid of Bees Algorithm and Flux Balance Analysis (BAFBA) by hybridizing Differential Evolution (DE) algorithm into neighborhood searching strategy of BAFBA. In addition, DBFBA is integrated with OptKnock to validate the results and so improve the reliability of the work. Through several experiments conducted on Escherichia coli, Bacillus subtilis, and Clostridium thermocellum as the model organisms, DBFBA has shown a better performance in terms of computational time, stability, growth rate, and production yield of desired phenotypes compared to the methods used in previous works.

Matched MeSH terms: Computational Biology/methods*
Development of a Bioinformatics Framework for Identification and Validation of Genomic Biomarkers and Key Immunopathology Processes and Controllers in Infectious and Non-infectious Severe Inflammatory Response Syndrome

Tong DL, Kempsell KE, Szakmany T, Ball G

Front Immunol, 2020;11:380.
PMID: 32318053 DOI: 10.3389/fimmu.2020.00380

Sepsis is defined as dysregulated host response caused by systemic infection, leading to organ failure. It is a life-threatening condition, often requiring admission to an intensive care unit (ICU). The causative agents and processes involved are multifactorial but are characterized by an overarching inflammatory response, sharing elements in common with severe inflammatory response syndrome (SIRS) of non-infectious origin. Sepsis presents with a range of pathophysiological and genetic features which make clinical differentiation from SIRS very challenging. This may reflect a poor understanding of the key gene inter-activities and/or pathway associations underlying these disease processes. Improved understanding is critical for early differential recognition of sepsis and SIRS and to improve patient management and clinical outcomes. Judicious selection of gene biomarkers suitable for development of diagnostic tests/testing could make differentiation of sepsis and SIRS feasible. Here we describe a methodologic framework for the identification and validation of biomarkers in SIRS, sepsis and septic shock patients, using a 2-tier gene screening, artificial neural network (ANN) data mining technique, using previously published gene expression datasets. Eight key hub markers have been identified which may delineate distinct, core disease processes and which show potential for informing underlying immunological and pathological processes and thus patient stratification and treatment. These do not show sufficient fold change differences between the different disease states to be useful as primary diagnostic biomarkers, but are instrumental in identifying candidate pathways and other associated biomarkers for further exploration.

Matched MeSH terms: Computational Biology/methods*
Detection of copy number variations in epilepsy using exome data

Tsuchida N, Nakashima M, Kato M, Heyman E, Inui T, Haginoya K, et al.

Clin Genet, 2018 03;93(3):577-587.
PMID: 28940419 DOI: 10.1111/cge.13144

Epilepsies are common neurological disorders and genetic factors contribute to their pathogenesis. Copy number variations (CNVs) are increasingly recognized as an important etiology of many human diseases including epilepsy. Whole-exome sequencing (WES) is becoming a standard tool for detecting pathogenic mutations and has recently been applied to detecting CNVs. Here, we analyzed 294 families with epilepsy using WES, and focused on 168 families with no causative single nucleotide variants in known epilepsy-associated genes to further validate CNVs using 2 different CNV detection tools using WES data. We confirmed 18 pathogenic CNVs, and 2 deletions and 2 duplications at chr15q11.2 of clinically unknown significance. Of note, we were able to identify small CNVs less than 10 kb in size, which might be difficult to detect by conventional microarray. We revealed 2 cases with pathogenic CNVs that one of the 2 CNV detection tools failed to find, suggesting that using different CNV tools is recommended to increase diagnostic yield. Considering a relatively high discovery rate of CNVs (18 out of 168 families, 10.7%) and successful detection of CNV with <10 kb in size, CNV detection by WES may be able to surrogate, or at least complement, conventional microarray analysis.

Matched MeSH terms: Computational Biology/methods
Deep-WET: a deep learning-based approach for predicting DNA-binding proteins using word embedding techniques with weighted features

Mahmud SMH, Goh KOM, Hosen MF, Nandi D, Shoombuatong W

Sci Rep, 2024 Feb 05;14(1):2961.
PMID: 38316843 DOI: 10.1038/s41598-024-52653-9

DNA-binding proteins (DBPs) play a significant role in all phases of genetic processes, including DNA recombination, repair, and modification. They are often utilized in drug discovery as fundamental elements of steroids, antibiotics, and anticancer drugs. Predicting them poses the most challenging task in proteomics research. Conventional experimental methods for DBP identification are costly and sometimes biased toward prediction. Therefore, developing powerful computational methods that can accurately and rapidly identify DBPs from sequence information is an urgent need. In this study, we propose a novel deep learning-based method called Deep-WET to accurately identify DBPs from primary sequence information. In Deep-WET, we employed three powerful feature encoding schemes containing Global Vectors, Word2Vec, and fastText to encode the protein sequence. Subsequently, these three features were sequentially combined and weighted using the weights obtained from the elements learned through the differential evolution (DE) algorithm. To enhance the predictive performance of Deep-WET, we applied the SHapley Additive exPlanations approach to remove irrelevant features. Finally, the optimal feature subset was input into convolutional neural networks to construct the Deep-WET predictor. Both cross-validation and independent tests indicated that Deep-WET achieved superior predictive performance compared to conventional machine learning classifiers. In addition, in an extensive independent test, Deep-WET was effective and outperformed several state-of-the-art methods for DBP prediction, with accuracy of 78.08%, MCC of 0.559, and AUC of 0.805. This superior performance shows that Deep-WET has a tremendous predictive capacity to predict DBPs. The web server of Deep-WET and curated datasets in this study are available at https://deepwet-dna.monarcatechnical.com/ . The proposed Deep-WET is anticipated to serve the community-wide effort for large-scale identification of potential DBPs.
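
For readers unfamiliar with the accuracy and MCC (Matthews correlation coefficient) figures quoted above, the sketch below shows how both are computed from a binary confusion matrix. It uses the standard textbook definitions and is not taken from the Deep-WET code; the function names are hypothetical, the counts in the usage note are invented for illustration, and returning 0.0 when the MCC denominator is zero is a common convention assumed here.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). Returns 0.0 when the denominator is zero,
    which is a convention assumed for this sketch.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

With invented counts of 40 true positives, 30 true negatives, 20 false positives, and 10 false negatives, accuracy is 0.7 while MCC is about 0.408, which shows why MCC is often reported alongside accuracy when errors are unevenly distributed across classes.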

Matched MeSH terms: Computational Biology/methods
DeSigN: connecting gene expression with therapeutics for drug repurposing and development

Lee BK, Tiong KH, Chang JK, Liew CS, Abdul Rahman ZA, Tan AC, et al.

BMC Genomics, 2017 01 25;18(Suppl 1):934.
PMID: 28198666 DOI: 10.1186/s12864-016-3260-7

BACKGROUND: The drug discovery and development pipeline is a long and arduous process that inevitably hampers rapid drug development. Therefore, strategies to improve the efficiency of drug development are urgently needed to enable effective drugs to enter the clinic. Precision medicine has demonstrated that genetic features of cancer cells can be used for predicting drug response, and emerging evidence suggests that gene-drug connections could be predicted more accurately by exploring the cumulative effects of many genes simultaneously.
RESULTS: We developed DeSigN, a web-based tool for predicting drug efficacy against cancer cell lines using gene expression patterns. The algorithm correlates phenotype-specific gene signatures derived from differentially expressed genes with pre-defined gene expression profiles associated with drug response data (IC50) from 140 drugs. DeSigN successfully predicted the right drug sensitivity outcome in four published GEO studies. Additionally, it predicted bosutinib, a Src/Abl kinase inhibitor, as a sensitive inhibitor for oral squamous cell carcinoma (OSCC) cell lines. In vitro validation of bosutinib in OSCC cell lines demonstrated that indeed, these cell lines were sensitive to bosutinib with IC50 of 0.8-1.2 μM. As further confirmation, we demonstrated experimentally that bosutinib has anti-proliferative activity in OSCC cell lines, demonstrating that DeSigN was able to robustly predict a drug that could be beneficial for tumour control.
CONCLUSIONS: DeSigN is a robust method that is useful for the identification of candidate drugs using an input gene signature obtained from gene expression analysis. This user-friendly platform could be used to identify drugs with unanticipated efficacy against cancer cell lines of interest, and therefore could be used for the repurposing of drugs, thus improving the efficiency of drug development.

Matched MeSH terms: Computational Biology/methods*
De novo assembly, characterization and functional annotation of pineapple fruit transcriptome through massively parallel sequencing

Ong WD, Voo LY, Kumar VS

PLoS One, 2012;7(10):e46937.
PMID: 23091603 DOI: 10.1371/journal.pone.0046937

BACKGROUND: Pineapple (Ananas comosus var. comosus) is an important tropical non-climacteric fruit with high commercial potential. Understanding the mechanism and processes underlying fruit ripening would enable scientists to improve quality traits such as flavor, texture, appearance and fruit sweetness. Although the pineapple is an important fruit, there is insufficient transcriptomic or genomic information available in public databases. Application of high throughput transcriptome sequencing to profile the pineapple fruit transcripts is therefore needed.
METHODOLOGY/PRINCIPAL FINDINGS: To facilitate this, we have performed transcriptome sequencing of ripe yellow pineapple fruit flesh using Illumina technology. About 4.7 million Illumina paired-end reads were generated and assembled using the Velvet de novo assembler. The assembly produced 28,728 unique transcripts with a mean length of approximately 200 bp. Sequence similarity search against the non-redundant NCBI database identified a total of 16,932 unique transcripts (58.93%) with significant hits. Out of these, 15,507 unique transcripts were assigned to gene ontology terms. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database identified 13,598 unique transcripts (47.33%), which were mapped to 126 pathways. The assembly revealed many transcripts that were previously unknown.
CONCLUSIONS: The unique transcripts derived from this work have rapidly increased the number of pineapple fruit mRNA transcripts now available in public databases. This information can be further utilized in gene expression, genomics and other functional genomics studies in pineapple.

Matched MeSH terms: Computational Biology/methods
Current progress of immunoinformatics approach harnessed for cellular- and antibody-dependent vaccine design

Kazi A, Chuah C, Majeed ABA, Leow CH, Lim BH, Leow CY

Pathog Glob Health, 2018 05;112(3):123-131.
PMID: 29528265 DOI: 10.1080/20477724.2018.1446773

Immunoinformatics plays a pivotal role in vaccine design, immunodiagnostic development, and antibody production. In the past, antibody design and vaccine development depended exclusively on immunological experiments, which are relatively expensive and time-consuming. However, recent advances in the field of immunological bioinformatics have provided feasible tools which can be used to lessen the time and cost required for vaccine and antibody development. This approach allows the selection of immunogenic regions from the pathogen genomes. The ideal regions could be developed as potential vaccine candidates to trigger protective immune responses in the hosts. At present, epitope-based vaccines are attractive concepts which have been successfully trialed to develop vaccines which target rapidly mutating pathogens. In this article, we provide an overview of the current progress of immunoinformatics and its applications in vaccine design, immune system modeling and therapeutics.

Matched MeSH terms: Computational Biology/methods*
Cupriavidus malaysiensis sp. nov., a novel poly(3-hydroxybutyrate-co-4-hydroxybutyrate) accumulating bacterium isolated from the Malaysian environment

Ramachandran H, Shafie NAH, Sudesh K, Azizan MN, Majid MIA, Amirul AA

Antonie Van Leeuwenhoek, 2018 Mar;111(3):361-372.
PMID: 29022146 DOI: 10.1007/s10482-017-0958-8

Bacterial classification on the basis of a polyphasic approach was conducted on three poly(3-hydroxybutyrate-co-4-hydroxybutyrate) [P(3HB-co-4HB)] accumulating bacterial strains that were isolated from samples collected from Malaysian environments: Kulim Lake, Sg. Pinang river and Sg. Manik paddy field. The Gram-negative, rod-shaped, motile, non-sporulating and non-fermenting bacteria were shown to belong to the genus Cupriavidus of the Betaproteobacteria on the basis of their 16S rRNA gene sequence analyses. The sequence similarity value with their near phylogenetic neighbour, Cupriavidus pauculus LMG3413T, was 98.5%. However, the DNA-DNA hybridization values (8-58%) and ribotyping analysis both enabled these strains to be differentiated from related Cupriavidus species with validly published names. The RiboPrint patterns of the three strains also revealed that the strains were genetically related even though they displayed clonal diversity. The major cellular fatty acids detected in these strains included C15:0 ISO 2OH/C16:1 ω7c, hexadecanoic (16:0) and cis-11-octadecenoic (C18:1 ω7c). Their G+C contents ranged from 68.0 to 68.6 mol%, and their major isoprenoid quinone was Ubiquinone Q-8. Of these three strains, only strain USMAHM13 (= DSM 25816 = KCTC 32390) was discovered to exhibit yellow pigmentation that is characteristic of the carotenoid family. Their assembled genomes also showed that the three strains were not identical in terms of their genome sizes, which were 7.82, 7.95 and 8.70 Mb for strains USMAHM13, USMAA1020 and USMAA2-4, respectively, and are slightly larger than that of Cupriavidus necator H16 (7.42 Mb). The average nucleotide identity (ANI) results indicated that the strains were genetically related and the genome pairs belong to the same species. On the basis of the results obtained in this study, the three strains are considered to represent a novel species for which the name Cupriavidus malaysiensis sp. nov. is proposed. The type strain of the species is USMAA1020T (= DSM 19416T = KCTC 32390T).

Matched MeSH terms: Computational Biology/methods
