METHODS: Known PCOS-related proteins (PCOSrp) from PCOSBase and DisGeNET were integrated with protein-protein interaction (PPI) information from the Human Integrated Protein-Protein Interaction rEference (HIPPIE) to construct a PCOS PPI network. The network was clustered with the DPClusO algorithm, and the resulting clusters were evaluated using Fisher's exact test. Pathway enrichment analysis was conducted with gProfileR to identify significant pathways.
RESULTS: The statistically significant clusters predicted 138 novel PCOSrp with 61.5% reliability, a prediction that is acceptable according to Cronbach's alpha. The androgen signalling pathway and the leptin signalling pathway were among the significant PCOS-related pathways, corroborating clinical observations: the androgen signalling pathway is responsible for producing male hormones in women with PCOS, whereas the leptin signalling pathway is involved in insulin sensitivity.
CONCLUSIONS: These results show that graph cluster analysis can provide additional insight into the pathobiology of PCOS, as the pathways identified as statistically significant correspond to earlier biological studies. Therefore, integrative analysis can reveal unknown mechanisms, which may enable the development of accurate diagnosis and effective treatment in PCOS.
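The cluster evaluation step in the METHODS above can be illustrated with a minimal sketch of a one-sided Fisher's exact test. It asks whether a cluster contains more known PCOSrp than expected by chance. The function name, argument names and toy numbers are assumptions made for illustration; the abstract does not state which sidedness or implementation the authors used.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    k: known disease proteins inside the cluster
    n: cluster size
    K: known disease proteins in the whole network
    N: network size
    Returns P(X >= k) under the hypergeometric null distribution.
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("inconsistent contingency counts")
    total = comb(N, n)  # all ways to draw a cluster of size n
    upper = min(n, K)   # largest possible overlap
    # Sum the probabilities of every overlap at least as large as k.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical toy network: 1000 proteins, 50 known PCOSrp,
    # and one cluster of 20 proteins containing 6 known PCOSrp.
    print(fisher_enrichment_p(k=6, n=20, K=50, N=1000))
```

A cluster with a small p-value would be treated as enriched, and its remaining members become candidate PCOSrp.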
METHOD: In this study, Rad50 mutations were retrieved from the SNPeffect 4.0 database and the literature. Each mutation was analyzed with bioinformatic tools, including PredictSNP, MutPred, SNPeffect 4.0, I-Mutant and MuPro, to assess its impact on molecular mechanism, biological function and protein stability.
RESULTS: We identified 103 frequently occurring mutations in the Rad50 protein domains and motifs, of which only 42 were classified as most deleterious. These mutations are mainly located in specific motifs of the Rad50 protein, such as the Walker A, Q-loop, Walker B, D-loop and signature motifs. Some of these mutations were predicted to negatively affect several functional sites that play important roles in the DNA repair mechanism and the cell cycle signaling pathway, highlighting the crucial role of Rad50 in these processes. Interestingly, mutations located in non-conserved regions were predicted to have neutral or non-damaging effects, in contrast with previous experimental studies that showed deleterious effects. This suggests that the software used in this study may have limitations in predicting the effects of mutations in non-conserved regions, and that the underlying algorithms need further improvement. In conclusion, this study identifies the amino acid substitutions that should be prioritized for their association with genetic disorders. This finding highlights the vital roles of certain residues in the conserved regions of Rad50, affected by substitutions such as K42E, C681A/S, C684R/S, S1202R, E1232Q and D1238N/A, which can be considered for more targeted future studies.
Methods: We used known GSL genes to construct a comprehensive GSL co-expression network. This network was analyzed with the DPClusOST algorithm using density values of 0.5, 0.6, 0.7, 0.8, and 0.9. The generated clusters were evaluated using Fisher's exact test to identify GSL gene co-expression clusters. A significance score (SScore) was calculated for each gene based on the p-value from Fisher's exact test. The SScore was then used in a receiver operating characteristic (ROC) analysis, performed with the ROCR package, to classify possible GSL genes. The area under the curve (AUC) computed with ROCR was used to select the most suitable cluster density value for further analysis. Finally, pathway enrichment analysis was conducted using ClueGO to identify significant pathways associated with the GSL clusters.
Results: The density value of 0.8 gave the highest area under the curve (AUC), leading to the selection of thirteen potential GSL genes from the top six significant clusters: IMDH3, MVP1, T19K24.17, MRSA2, SIR, ASP4, MTO1, At1g21440, HMT3, At3g47420, PS1, SAL1, and At3g14220. Four of these genes (MTO1, SIR, SAL1, and IMDH3) were identified by the pathway enrichment analysis of the significant clusters. These genes are directly related to GSL-associated pathways such as sulfur metabolism and valine, leucine, and isoleucine biosynthesis. The results demonstrate that network clustering can identify potential GSL genes that cannot be found by a standard similarity search.
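The density selection described in the Methods above can be sketched as follows. The sketch converts p-values to a score, computes a rank-based AUC, and picks the density with the highest AUC. The abstract does not give the SScore formula, so the `-log10(p)` form used here is an assumption, as are all function names. The study itself used the ROCR package in R, whereas this is a plain Python stand-in.

```python
import math


def sscore(p_value, floor=1e-300):
    """Assumed significance score: -log10 of the Fisher p-value.

    The floor avoids log(0) for extremely small p-values.
    """
    return -math.log10(max(p_value, floor))


def roc_auc(scores, labels):
    """Rank-based AUC: the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def best_density(results):
    """results maps a density value to a (scores, labels) pair.

    Returns the density whose scores give the highest AUC.
    """
    return max(results, key=lambda d: roc_auc(*results[d]))


if __name__ == "__main__":
    # Hypothetical toy data: label 1 = known GSL gene, 0 = other gene.
    toy = {
        0.7: ([sscore(p) for p in (0.01, 0.2, 0.03, 0.5)], [1, 1, 0, 0]),
        0.8: ([sscore(p) for p in (0.001, 0.02, 0.3, 0.6)], [1, 1, 0, 0]),
    }
    print(best_density(toy))
```

The pairwise AUC is quadratic in the number of genes, which is adequate for a sketch but would be replaced by a sorted-rank computation for large gene sets.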
RESULTS: This study attempts to classify Angiosperms using information on plant sulfur-containing compounds (SCC), or sulphated compounds. The SCC dataset of 692 plant species was collected from the comprehensive species-metabolite relationship family (KNApSAcK) database. Structural similarity scores were determined for metabolite pairs under all possible combinations (plant species-metabolite), and metabolite pairs with a Tanimoto coefficient > 0.85 were selected for clustering using a machine learning algorithm. Metabolite clustering showed an association between the clusters of structurally similar metabolites and the metabolite content of the plant species. The phylogenetic tree constructed for the Angiosperms displayed three major clades, of which clade 1 and clade 2 contained eudicots only, and clade 3 contained a mixture of eudicots and monocots. The SCC-based Angiosperm phylogeny is a subset of the existing monocot-dicot classification. The majority of eudicots in clades 1 and 2 were represented by glucosinolate compounds. These clades may have been a mixture of ancestral species, whilst the combined presence of monocots and dicots in clade 3 suggests diversification of sulphated chemical structures during adaptation in the course of evolutionary change.
CONCLUSIONS: Sulphated chemoinformatics informs the classification of Angiosperms via a machine learning technique.
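The similarity filter mentioned above (Tanimoto coefficient > 0.85) can be illustrated with a minimal sketch. It assumes each metabolite is represented as a set of "on" fingerprint bits. The fingerprint type, the function names and the toy data are assumptions, since the abstract does not say how the structures were encoded.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits:
    |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def similar_pairs(fingerprints, threshold=0.85):
    """Return (name1, name2, score) for every metabolite pair whose
    Tanimoto coefficient is strictly above the threshold."""
    pairs = []
    for n1, n2 in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[n1], fingerprints[n2])
        if score > threshold:
            pairs.append((n1, n2, score))
    return pairs


if __name__ == "__main__":
    # Hypothetical fingerprints: bit positions that are set.
    toy = {
        "metabolite_a": set(range(20)),
        "metabolite_b": set(range(19)),
        "metabolite_c": {100, 101},
    }
    for name1, name2, score in similar_pairs(toy):
        print(name1, name2, round(score, 2))
```

The retained pairs would then form the edges of the similarity graph that is passed to the clustering step.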
METHOD: The SARS-CoV receptor structure files (viral structural components) were retrieved from the Protein Data Bank (PDB): membrane protein (PDB ID: 3I6G), main protease (PDB ID: 5RE4), and spike glycoproteins (PDB IDs: 6VXX and 6VYB). The receptor binding pocket regions were identified with Discovery Studio (BIOVIA) for targeted docking with the TBF polyphenols (genistin, kaempferol, mellein, rhoifolin and scutellarein). The ligand and SARS-CoV family receptor structure files were pre-processed using the AutoDock tools. Molecular docking was performed with the Lamarckian genetic algorithm using AutoDock Vina 4.2 software. The best pose (ligand-receptor complex) from the molecular docking analysis was selected based on the minimum binding energy (MBE) and the extent of structural interactions, as indicated by the BIOVIA visualization tool. The selected complex was validated by a 100 ns molecular dynamics (MD) simulation run using the GROMACS software. The dynamic behaviour and stability of the receptor-ligand complex were evaluated by the root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), solvent accessible surface area (SASA), solvent accessible surface volume (SASV) and number of hydrogen bonds.
RESULTS: At RMSD = 0, the TBF polyphenols showed fairly strong physical interactions with the SARS-CoV receptors under all possible combinations. The MBE of the TBF polyphenol-bound SARS-CoV complexes ranged from -4.6 to -8.3 kcal/mol. Analysis of the structural interactions showed hydrogen bonds, electrostatic interactions and hydrophobic interactions between the receptor residues (RR) and the ligand atoms. Based on the MBE, the 3I6G-rhoifolin (MBE = -8.3 kcal/mol) and 5RE4-genistin (MBE = -7.6 kcal/mol) complexes had the lowest values. However, the latter showed a greater extent of interactions between the RRs and the ligand atoms and was therefore further validated by MD simulation. The MD simulation parameters of the 5RE4-genistin complex over a 100 ns run indicated good structural stability, with minimal flexibility within the genistin binding pocket region. The findings suggest that S. torvum polyphenols hold good therapeutic potential in COVID-19 management.
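Two of the stability measures named in the METHOD above, RMSD and Rg, can be illustrated with a minimal sketch for a single trajectory frame. It assumes coordinates are given as (x, y, z) tuples and that the frame has already been superimposed on the reference, so no fitting is performed. The function names are illustrative and this is not the GROMACS implementation.

```python
import math


def rmsd(reference, frame):
    """Root mean square deviation between two equally sized coordinate
    lists, assuming the frame is already aligned to the reference."""
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(reference, frame)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(reference))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration about the centre of mass.

    If no masses are given, every atom is assumed to have unit mass.
    """
    if not coords:
        raise ValueError("no coordinates given")
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Centre of mass along each axis.
    com = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total
        for axis in range(3)
    ]
    weighted = sum(
        m * sum((c[axis] - com[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total)


if __name__ == "__main__":
    # Hypothetical three-atom frame, coordinates in nm.
    ref = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
    frame = [(0.0, 0.01, 0.0), (0.1, 0.0, 0.01), (0.21, 0.0, 0.0)]
    print(rmsd(ref, frame), radius_of_gyration(frame))
```

Applied frame by frame over a trajectory, a plateau in RMSD together with a steady Rg is the usual reading of a structurally stable complex.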