Displaying publications 21 - 40 of 119 in total

  1. Tan CH, Tan KY, Fung SY, Tan NH
    BMC Genomics, 2015;16:687.
    PMID: 26358635 DOI: 10.1186/s12864-015-1828-2
    The king cobra (Ophiophagus hannah) is widely distributed throughout many parts of Asia. This study aims to investigate the complexity of Malaysian Ophiophagus hannah (MOh) venom for a better understanding of king cobra venom variation and its envenoming pathophysiology. The venom gland transcriptome was investigated using the Illumina HiSeq™ platform, while the venom proteome was profiled by 1D-SDS-PAGE-nano-ESI-LCMS/MS.
    Matched MeSH terms: Computational Biology/methods
  2. Parsons MT, Tudini E, Li H, Hahnen E, Wappenschmidt B, Feliubadaló L, et al.
    Hum Mutat, 2019 Sep;40(9):1557-1578.
    PMID: 31131967 DOI: 10.1002/humu.23818
    The multifactorial likelihood analysis method has demonstrated utility for quantitative assessment of variant pathogenicity for multiple cancer syndrome genes. Independent data types currently incorporated in the model for assessing BRCA1 and BRCA2 variants include clinically calibrated prior probability of pathogenicity based on variant location and bioinformatic prediction of variant effect, co-segregation, family cancer history profile, co-occurrence with a pathogenic variant in the same gene, breast tumor pathology, and case-control information. Research and clinical data for multifactorial likelihood analysis were collated for 1,395 BRCA1/2 predominantly intronic and missense variants, enabling classification based on posterior probability of pathogenicity for 734 variants: 447 variants were classified as (likely) benign, and 94 as (likely) pathogenic; and 248 classifications were new or considerably altered relative to ClinVar submissions. Classifications were compared with information not yet included in the likelihood model, and evidence strengths aligned to those recommended for ACMG/AMP classification codes. Altered mRNA splicing or function relative to known nonpathogenic variant controls were moderately to strongly predictive of variant pathogenicity. Variant absence in population datasets provided supporting evidence for variant pathogenicity. These findings have direct relevance for BRCA1 and BRCA2 variant evaluation, and justify the need for gene-specific calibration of evidence types used for variant classification.
    Matched MeSH terms: Computational Biology/methods*
  3. Charoenkwan P, Chotpatiwetchkul W, Lee VS, Nantasenamat C, Shoombuatong W
    Sci Rep, 2021 Dec 10;11(1):23782.
    PMID: 34893688 DOI: 10.1038/s41598-021-03293-w
    Owing to their ability to maintain a thermodynamically stable fold at extremely high temperatures, thermophilic proteins (TPPs) play a critical role in basic research and a variety of applications in the food industry. As a result, the development of computational models for rapidly and accurately identifying novel TPPs from a large number of uncharacterized protein sequences is desirable. Although computational models have already been developed for characterizing thermophilic proteins, their performance and interpretability remain unsatisfactory. We present a novel sequence-based thermophilic protein predictor, termed SCMTPP, for improving model predictability and interpretability. First, an up-to-date and high-quality dataset consisting of 1853 TPPs and 3233 non-TPPs was compiled from published literature. Second, the SCMTPP predictor was created by combining the scoring card method (SCM) with estimated propensity scores of g-gap dipeptides. Benchmarking experiments revealed that SCMTPP had a cross-validation accuracy of 0.883, which was comparable to that of a support vector machine-based predictor (0.906-0.910) and 2-17% higher than that of commonly used machine learning models. Furthermore, SCMTPP outperformed the state-of-the-art approach (ThermoPred) on the independent test dataset, with accuracy and MCC of 0.865 and 0.731, respectively. Finally, the SCMTPP-derived propensity scores were used to elucidate the critical physicochemical properties for protein thermostability enhancement. In terms of interpretability and generalizability, comparative results showed that SCMTPP was effective for identifying and characterizing TPPs. We have implemented the proposed predictor as a user-friendly online web server at http://pmlabstack.pythonanywhere.com/SCMTPP in order to allow easy access to the model. SCMTPP is expected to be a powerful tool for facilitating community-wide efforts to identify TPPs on a large scale and guiding experimental characterization of TPPs.
    Matched MeSH terms: Computational Biology/methods*
  4. Mahmud SMH, Goh KOM, Hosen MF, Nandi D, Shoombuatong W
    Sci Rep, 2024 Feb 05;14(1):2961.
    PMID: 38316843 DOI: 10.1038/s41598-024-52653-9
    DNA-binding proteins (DBPs) play a significant role in all phases of genetic processes, including DNA recombination, repair, and modification. They are often utilized in drug discovery as fundamental elements of steroids, antibiotics, and anticancer drugs. Predicting them poses the most challenging task in proteomics research. Conventional experimental methods for DBP identification are costly and sometimes biased toward prediction. Therefore, developing powerful computational methods that can accurately and rapidly identify DBPs from sequence information is an urgent need. In this study, we propose a novel deep learning-based method called Deep-WET to accurately identify DBPs from primary sequence information. In Deep-WET, we employed three powerful feature encoding schemes, namely Global Vectors, Word2Vec, and fastText, to encode the protein sequence. Subsequently, these three features were sequentially combined and weighted using the weights obtained from the elements learned through the differential evolution (DE) algorithm. To enhance the predictive performance of Deep-WET, we applied the SHapley Additive exPlanations approach to remove irrelevant features. Finally, the optimal feature subset was input into convolutional neural networks to construct the Deep-WET predictor. Both cross-validation and independent tests indicated that Deep-WET achieved superior predictive performance compared to conventional machine learning classifiers. In addition, in an extensive independent test, Deep-WET was effective and outperformed several state-of-the-art methods for DBP prediction, with an accuracy of 78.08%, MCC of 0.559, and AUC of 0.805. This superior performance shows that Deep-WET has a tremendous capacity to predict DBPs. The web server of Deep-WET and the curated datasets in this study are available at https://deepwet-dna.monarcatechnical.com/. The proposed Deep-WET is anticipated to serve the community-wide effort for large-scale identification of potential DBPs.
    Matched MeSH terms: Computational Biology/methods
  5. Sabetian S, Shamsir MS
    BMC Syst Biol, 2015;9:37.
    PMID: 26187737 DOI: 10.1186/s12918-015-0186-7
    Sperm-egg interaction defect is a significant cause of in-vitro fertilization failure for infertile cases. Numerous molecular interactions in the form of protein-protein interactions mediate the sperm-egg membrane interaction process. Recent studies have demonstrated that in addition to experimental techniques, computational methods, namely protein interaction network approach, can address protein-protein interactions between human sperm and egg. Up to now, no drugs have been detected to treat sperm-egg interaction disorder, and the initial step in drug discovery research is finding out essential proteins or drug targets for a biological process. The main purpose of this study is to identify putative drug targets for human sperm-egg interaction deficiency and consider if the detected essential proteins are targets for any known drugs using protein-protein interaction network and ingenuity pathway analysis.
    Matched MeSH terms: Computational Biology/methods*
  6. Chew TH, Joyce-Tan KH, Akma F, Shamsir MS
    Bioinformatics, 2011 May 1;27(9):1320-1.
    PMID: 21398666 DOI: 10.1093/bioinformatics/btr109
    birgHPC, a bootable Linux Live CD, has been developed to create high-performance clusters for bioinformatics and molecular dynamics studies using any Local Area Network (LAN)-networked computers. birgHPC features automated hardware and slots detection and provides a simple job submission interface. The latest versions of GROMACS, NAMD, mpiBLAST and ClustalW-MPI can be run in parallel by simply booting the birgHPC CD or flash drive from the head node, which immediately positions the rest of the PCs on the network as computing nodes. Thus, a temporary, affordable, scalable and high-performance computing environment can be built by non-computing-based researchers using low-cost commodity hardware.
    Matched MeSH terms: Computational Biology/methods*
  7. Ng XY, Rosdi BA, Shahrudin S
    Biomed Res Int, 2015;2015:212715.
    PMID: 25802839 DOI: 10.1155/2015/212715
    This study concerns an attempt to establish a new method for predicting antimicrobial peptides (AMPs), which are important to the immune system. Recently, researchers have become interested in designing alternative drugs based on AMPs because a large number of bacterial strains have become resistant to available antibiotics. However, researchers have encountered obstacles in the AMP design process, as experiments to extract AMPs from protein sequences are costly and require a long set-up time. Therefore, a computational tool for AMP prediction is needed to resolve this problem. In this study, an integrated algorithm is newly introduced to predict AMPs by integrating sequence alignment and support vector machine- (SVM-) LZ complexity pairwise algorithm. It was observed that, when all sequences in the training set are used, the sensitivity of the proposed algorithm is 95.28% in the jackknife test and 87.59% in the independent test, while the sensitivity obtained for the jackknife test and independent test is 88.74% and 78.70%, respectively, when only the sequences that have less than 70% similarity are used. Applying the proposed algorithm may allow researchers to effectively predict AMPs from unknown protein peptide sequences with higher sensitivity.
    Matched MeSH terms: Computational Biology/methods
  8. Kaur H, Ahmad M, Scaria V
    Interdiscip Sci, 2016 Mar;8(1):95-101.
    PMID: 26298582 DOI: 10.1007/s12539-015-0273-x
    Multidrug-resistant Salmonella enterica serotype typhi is emerging in pandemic proportions throughout the world; therefore, there is a need to speed up the discovery of novel molecules that have different modes of action, are less influenced by resistance formation, and could be used as drugs for the treatment of salmonellosis, particularly typhoid fever. The PhoP regulon is well studied and has now been shown to be a critical regulator of a number of gene expressions which are required for intracellular survival of S. enterica and the pathophysiology of diseases like typhoid. The evident roles of two-component PhoP-/PhoQ-regulated products in salmonella virulence have motivated attempts to target them therapeutically. Although the discovery process of biologically active compounds for the treatment of typhoid relies on hit-finding procedures, using high-throughput screening technology alone is very expensive, as well as time consuming when performed on large scales. With the recent advancement in combinatorial chemistry and contemporary techniques for compound synthesis, more and more compounds are available, which gives ample growth of diverse compound libraries, but the time and endeavor required to screen these unfocused, massive and diverse libraries have been slightly reduced in the past years. Hence, there is demand to improve the quality of hits and the success rate of high-throughput screening, which requires compound libraries focused and biased toward the particular target. Therefore, we still need an advantageous and expedient method to prioritize the molecules that will be utilized for biological screens, which saves time and is also inexpensive. In this context, in silico methods like machine learning are widely applicable techniques used to build computational models for high-throughput virtual screens to prioritize molecules for advanced study. Furthermore, in computational analysis, we extended our study to identify the common enriched structural entities among the biologically active compounds toward finding the privileged scaffold.
    Matched MeSH terms: Computational Biology/methods*
  9. Muniyandi RC, Zin AM, Sanders JW
    Biosystems, 2013 Dec;114(3):219-26.
    PMID: 24120990 DOI: 10.1016/j.biosystems.2013.09.008
    This paper presents a method to convert the deterministic, continuous representation of a biological system by ordinary differential equations into a non-deterministic, discrete membrane computation. The dynamics of the membrane computation is governed by rewrite rules operating at certain rates. That has the advantage of applying accurately to small systems, and of expressing rates of change that are determined locally, by region, but not necessarily globally. Such spatial information augments the standard differentiable approach to provide a more realistic model. A biological case study of the ligand-receptor network of protein TGF-β is used to validate the effectiveness of the conversion method. It demonstrates the sense in which the behaviours and properties of the system are better preserved in the membrane computing model, suggesting that the proposed conversion method may prove useful for biological systems in particular.
    Matched MeSH terms: Computational Biology/methods*
  10. Iqbal MJ, Faye I, Samir BB, Said AM
    ScientificWorldJournal, 2014;2014:173869.
    PMID: 25045727 DOI: 10.1155/2014/173869
    Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics are to store and manage biological data, and to develop and analyze computational tools that enhance the understanding of these data. The size of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for the experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms were proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of a large number of newly discovered proteins. The existing classification results are unsatisfactory due to the huge number of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique has been proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth.
    Matched MeSH terms: Computational Biology/methods*
  11. Hosseinpoor AR, Nambiar D, Schlotheuber A, Reidpath D, Ross Z
    BMC Med Res Methodol, 2016 Oct 19;16(1):141.
    PMID: 27760520
    BACKGROUND: It is widely recognised that the pursuit of sustainable development cannot be accomplished without addressing inequality, or observed differences between subgroups of a population. Monitoring health inequalities allows for the identification of health topics where major group differences exist, dimensions of inequality that must be prioritised to effect improvements in multiple health domains, and also population subgroups that are multiply disadvantaged. While availability of data to monitor health inequalities is gradually improving, there is a commensurate need to increase, within countries, the technical capacity for analysis of these data and interpretation of results for decision-making. Prior efforts to build capacity have yielded demand for a toolkit with the computational ability to display disaggregated data and summary measures of inequality in an interactive and customisable fashion that would facilitate interpretation and reporting of health inequality in a given country.

    METHODS: To answer this demand, the Health Equity Assessment Toolkit (HEAT), was developed between 2014 and 2016. The software, which contains the World Health Organization's Health Equity Monitor database, allows the assessment of inequalities within a country using over 30 reproductive, maternal, newborn and child health indicators and five dimensions of inequality (economic status, education, place of residence, subnational region and child's sex, where applicable).

    RESULTS/CONCLUSION: HEAT was beta-tested in 2015 as part of ongoing capacity building workshops on health inequality monitoring. This is the first and only application of its kind; further developments are proposed to introduce an upload data feature, translate it into different languages and increase interactivity of the software. This article will present the main features and functionalities of HEAT and discuss its relevance and use for health inequality monitoring.

    Matched MeSH terms: Computational Biology/methods*
  12. Zhao K, Ishida Y, Green CE, Davidson AG, Sitam FAT, Donnelly CL, et al.
    J Hered, 2019 Dec 17;110(7):761-768.
    PMID: 31674643 DOI: 10.1093/jhered/esz058
    Illegal hunting is a major threat to the elephants of Africa, with more elephants killed by poachers than die from natural causes. DNA from tusks has been used to infer the source populations for confiscated ivory, relying on nuclear genetic markers. However, mitochondrial DNA (mtDNA) sequences can also provide information on the geographic origins of elephants due to female elephant philopatry. Here, we introduce the Loxodonta Localizer (LL; www.loxodontalocalizer.org), an interactive software tool that uses a database of mtDNA sequences compiled from previously published studies to provide information on the potential provenance of confiscated ivory. A 316 bp control region sequence, which can be readily generated from DNA extracted from ivory, is used as a query. The software generates a listing of haplotypes reported among 1917 African elephants in 24 range countries, sorted in order of similarity to the query sequence. The African locations from which haplotype sequences have been previously reported are shown on a map. We demonstrate examples of haplotypes reported from only a single locality or country, examine the utility of the program in identifying elephants from countries with varying degrees of sampling, and analyze batches of confiscated ivory. The LL allows for the source of confiscated ivory to be assessed within days, using widely available molecular methods that do not depend on a particular platform or laboratory. The program enables identification of potential regions or localities from which elephants are being poached, with capacity for rapid identification of populations newly or consistently targeted by poachers.
    Matched MeSH terms: Computational Biology/methods
  13. Ismail AM, Mohamad MS, Abdul Majid H, Abas KH, Deris S, Zaki N, et al.
    Biosystems, 2017 Dec;162:81-89.
    PMID: 28951204 DOI: 10.1016/j.biosystems.2017.09.013
    Mathematical modelling is fundamental to understand the dynamic behavior and regulation of the biochemical metabolisms and pathways that are found in biological systems. Pathways are used to describe complex processes that involve many parameters. It is important to have an accurate and complete set of parameters that describe the characteristics of a given model. However, measuring these parameters is typically difficult and even impossible in some cases. Furthermore, the experimental data are often incomplete and also suffer from experimental noise. These shortcomings make it challenging to identify the best-fit parameters that can represent the actual biological processes involved in biological systems. Computational approaches are required to estimate these parameters. The estimation is converted into multimodal optimization problems that require a global optimization algorithm that can avoid local solutions. These local solutions can lead to a bad fit when calibrating with a model. Although the model itself can potentially match a set of experimental data, a high-performance estimation algorithm is required to improve the quality of the solutions. This paper describes an improved hybrid of particle swarm optimization and the gravitational search algorithm (IPSOGSA) to improve the efficiency of a global optimum (the best set of kinetic parameter values) search. The findings suggest that the proposed algorithm is capable of narrowing down the search space by exploiting the feasible solution areas. Hence, the proposed algorithm is able to achieve a near-optimal set of parameters at a fast convergence speed. The proposed algorithm was tested and evaluated based on two aspartate pathways that were obtained from the BioModels Database. The results show that the proposed algorithm outperformed other standard optimization algorithms in terms of accuracy and near-optimal kinetic parameter estimation. Nevertheless, the proposed algorithm is only expected to work well in small scale systems. In addition, the results of this study can be used to estimate kinetic parameter values in the stage of model selection for different experimental conditions.
    Matched MeSH terms: Computational Biology/methods*
  14. Lee Y, Roslan R, Azizan S, Firdaus-Raih M, Ramlan EI
    BMC Bioinformatics, 2016 Oct 28;17(1):438.
    PMID: 27793081
    BACKGROUND: Biological macromolecules (DNA, RNA and proteins) are capable of processing physical or chemical inputs to generate outputs that parallel conventional Boolean logical operators. However, the design of functional modules that will enable these macromolecules to operate as synthetic molecular computing devices is challenging.

    RESULTS: Using three simple heuristics, we designed RNA sensors that can mimic the function of a seven-segment display (SSD). Ten independent and orthogonal sensors representing the numerals 0 to 9 are designed and constructed. Each sensor has its own unique oligonucleotide binding site region that is activated uniquely by a specific input. Each operator was subjected to a stringent in silico filtering. Random sensors were selected and functionally validated via ribozyme self cleavage assays that were visualized via electrophoresis.

    CONCLUSIONS: By utilising simple permutation and randomisation in the sequence design phase, we have developed functional RNA sensors thus demonstrating that even the simplest of computational methods can greatly aid the design phase for constructing functional molecular devices.

    Matched MeSH terms: Computational Biology/methods*
  15. Tang PW, Chua PS, Chong SK, Mohamad MS, Choon YW, Deris S, et al.
    Recent Pat Biotechnol, 2015;9(3):176-97.
    PMID: 27185502
    BACKGROUND: Predicting the effects of genetic modification is difficult due to the complexity of metabolic networks. Various gene knockout strategies have been utilised to deactivate specific genes in order to determine the effects of these genes on the function of microbes. Deactivation of genes can lead to deletion of certain proteins and functions. Through these strategies, the associated function of a deleted gene can be identified from the metabolic networks.

    METHODS: The main aim of this paper is to review the available techniques in gene knockout strategies for microbial cells. The techniques are reviewed in terms of their methodology and recent applications in microbial cells. In addition, the advantages and disadvantages of the techniques are compared and discussed, and the related patents are listed.

    RESULTS: Traditionally, gene knockout is done through wet lab (in vivo) techniques, which are conducted through laboratory experiments. However, these techniques are costly and time consuming. Hence, various dry lab (in silico) techniques, which are conducted using computational approaches, have been developed to surmount these problems.

    CONCLUSION: The development of numerous techniques for gene knockout in microbial cells has brought many advancements in the study of gene functions. Based on the literature, we found that the gene knockout strategies currently used are sensibly implemented with regard to their benefits.

    Matched MeSH terms: Computational Biology/methods
  16. Razmara J, Deris SB, Parvizpour S
    Comput Biol Med, 2013 Oct;43(10):1614-21.
    PMID: 24034753 DOI: 10.1016/j.compbiomed.2013.07.022
    The structural comparison of proteins is a vital step in structural biology that is used to predict and analyse a new unknown protein function. Although a number of different techniques have been explored, the study to develop new alternative methods is still an active research area. The present paper introduces a text modelling-based technique for the structural comparison of proteins. The method models the secondary and tertiary structure of proteins in two linear sequences and then applies them to the comparison of two structures. The technique used for pairwise comparison of the sequences has been adopted from computational linguistics and its well-known techniques for analysing and quantifying textual sequences. To this end, an n-gram modelling technique is used to capture regularities between sequences, and then, the cross-entropy concept is employed to measure their similarities. Several experiments are conducted to evaluate the performance of the method and compare it with other commonly used programs. The assessments for information retrieval evaluation demonstrate that the technique has a high running speed, which is similar to other linear encoding methods, such as 3D-BLAST, SARST, and TS-AMIR, whereas its accuracy is comparable to CE and TM-align, which are high accuracy comparison tools. Accordingly, the results demonstrate that the algorithm has high efficiency compared with other state-of-the-art methods.
    Matched MeSH terms: Computational Biology/methods*
  17. Khan S, Zakariah M, Rolfo C, Robrecht L, Palaniappan S
    Oncotarget, 2017 May 09;8(19):30830-30843.
    PMID: 27027344 DOI: 10.18632/oncotarget.8306
    Although the idea that bacteria cause different types of cancer emerged about a century ago, the potential mechanisms of carcinogenesis are still not well established. Many reports have shown the involvement of M. hominis in the development of prostate cancer; however, the mechanisms underlying the growth and development of prostate cancer remain poorly understood. In the current study, we predicted M. hominis proteins targeting the mitochondria and cytoplasm of host cells and their implication in prostate cancer. A total of 77 and 320 proteins from the M. hominis proteome were predicted to target the mitochondria and cytoplasm of host cells, respectively. In particular, various targeted proteins may interfere with the normal growth behaviour of host cells, thereby altering the decision of programmed cell death. Furthermore, we investigated possible mechanisms of the mitochondrial and cytoplasmic targeted proteins of M. hominis in the etiology of prostate cancer by screening the whole proteome.
    Matched MeSH terms: Computational Biology/methods
  18. Renaud G, Petersen B, Seguin-Orlando A, Bertelsen MF, Waller A, Newton R, et al.
    Sci Adv, 2018 Apr;4(4):eaaq0392.
    PMID: 29740610 DOI: 10.1126/sciadv.aaq0392
    Donkeys and horses share a common ancestor dating back to about 4 million years ago. Although a high-quality genome assembly at the chromosomal level is available for the horse, current assemblies available for the donkey are limited to moderately sized scaffolds. The absence of a better-quality assembly for the donkey has hampered studies involving the characterization of patterns of genetic variation at the genome-wide scale. These range from the application of genomic tools to selective breeding and conservation to the more fundamental characterization of the genomic loci underlying speciation and domestication. We present a new high-quality donkey genome assembly obtained using the Chicago HiRise assembly technology, providing scaffolds of subchromosomal size. We make use of this new assembly to obtain more accurate measures of heterozygosity for equine species other than the horse, both genome-wide and locally, and to detect runs of homozygosity potentially pertaining to positive selection in domestic donkeys. Finally, this new assembly allowed us to identify fine-scale chromosomal rearrangements between the horse and the donkey that likely played an active role in their divergence and, ultimately, speciation.
    Matched MeSH terms: Computational Biology/methods
  19. Dawson NL, Sillitoe I, Lees JG, Lam SD, Orengo CA
    Methods Mol Biol, 2017;1558:79-110.
    PMID: 28150234 DOI: 10.1007/978-1-4939-6783-4_4
    This chapter describes the generation of the data in the CATH-Gene3D online resource and how it can be used to study protein domains and their evolutionary relationships. Methods will be presented for: comparing protein structures, recognizing homologs, predicting domain structures within protein sequences, and subclassifying superfamilies into functionally pure families, together with a guide on using the webpages.
    Matched MeSH terms: Computational Biology/methods*
  20. Sillitoe I, Bordin N, Dawson N, Waman VP, Ashford P, Scholes HM, et al.
    Nucleic Acids Res, 2021 Jan 08;49(D1):D266-D273.
    PMID: 33237325 DOI: 10.1093/nar/gkaa1079
    CATH (https://www.cathdb.info) identifies domains in protein structures from wwPDB and classifies these into evolutionary superfamilies, thereby providing structural and functional annotations. There are two levels: CATH-B, a daily snapshot of the latest domain structures and superfamily assignments, and CATH+, with additional derived data, such as predicted sequence domains, and functionally coherent sequence subsets (Functional Families or FunFams). The latest CATH+ release, version 4.3, significantly increases coverage of structural and sequence data, with an addition of 65,351 fully-classified domain structures (+15%), providing 500,238 structural domains, and 151 million predicted sequence domains (+59%) assigned to 5481 superfamilies. The FunFam generation pipeline has been re-engineered to cope with the increased influx of data. Three times more sequences are captured in FunFams, with a concomitant increase in functional purity, information content and structural coverage. FunFam expansion increases the structural annotations provided for experimental GO terms (+59%). We also present CATH-FunVar web-pages displaying variations in protein sequences and their proximity to known or predicted functional sites. We present two case studies: (1) putative cancer drivers and (2) SARS-CoV-2 proteins. Finally, we have improved links to and from CATH including SCOP, InterPro, Aquaria and 2DProt.
    Matched MeSH terms: Computational Biology/methods