RESULTS: This study attempts to classify Angiosperms using information on plant sulfur-containing compounds (SCCs), or sulphated compounds. The SCC dataset of 692 plant species was collected from the comprehensive species-metabolite relationship family (KNApSAcK) database. The structural similarity scores of metabolite pairs under all possible combinations (plant species-metabolite) were determined, and metabolite pairs with a Tanimoto coefficient > 0.85 were selected for clustering using a machine learning algorithm. Metabolite clustering showed an association between clusters of structurally similar metabolites and metabolite content among the plant species. The phylogenetic tree constructed for Angiosperms displayed three major clades, of which clades 1 and 2 contained eudicots only and clade 3 contained a mixture of eudicots and monocots. The SCC-based construction of Angiosperm phylogeny is a subset of the existing monocot-dicot classification. The majority of eudicots present in clades 1 and 2 were represented by glucosinolate compounds. These clades with SCCs may have been a mixture of ancestral species, whilst the combined presence of monocots and dicots in clade 3 suggests diversification of sulphated chemical structures in the course of adaptation during evolutionary change.
CONCLUSIONS: Sulphated chemoinformatics informs the classification of Angiosperms via a machine learning technique.
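The similarity filter described in the RESULTS above (metabolite pairs with a Tanimoto coefficient > 0.85) can be illustrated with a minimal sketch. It assumes each metabolite is already encoded as a set of "on" fingerprint bit indices; the fingerprint type used in the study is not stated, and the names `tanimoto`, `similar_pairs`, and the toy fingerprints below are hypothetical and not taken from the KNApSAcK data.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def similar_pairs(fingerprints, threshold=0.85):
    """Return (name_a, name_b, score) for every pair scoring strictly above threshold."""
    names = sorted(fingerprints)
    pairs = []
    for name_a, name_b in combinations(names, 2):
        score = tanimoto(fingerprints[name_a], fingerprints[name_b])
        if score > threshold:
            pairs.append((name_a, name_b, score))
    return pairs


# Hypothetical toy fingerprints (illustration only, not real metabolite data).
toy_fingerprints = {
    "m1": set(range(1, 11)),   # bits 1..10
    "m2": set(range(1, 12)),   # bits 1..11, shares 10 bits with m1
    "m3": {1, 2, 20, 21},      # shares only 2 bits with m1 and m2
}

if __name__ == "__main__":
    for name_a, name_b, score in similar_pairs(toy_fingerprints):
        print(f"{name_a} ~ {name_b}: {score:.3f}")
```

Only pairs that pass the threshold would be handed on to the clustering step; the clustering algorithm itself is not specified in the abstract and is therefore not sketched here.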
METHOD: The SARS-CoV receptor structure files (viral structural components) were retrieved from the Protein Data Bank (PDB): membrane protein (PDB ID: 3I6G), main protease (PDB ID: 5RE4), and spike glycoproteins (PDB ID: 6VXX and 6VYB). The receptor binding pocket regions were identified with Discovery Studio (BIOVIA) for targeted docking with TBF polyphenols (genistin, kaempferol, mellein, rhoifolin and scutellarein). The ligand and SARS-CoV family receptor structure files were pre-processed using the AutoDock tools. Molecular docking was performed with the Lamarckian genetic algorithm using AutoDock Vina 4.2 software. The best pose (ligand-receptor complex) from the molecular docking analysis was selected based on the minimum binding energy (MBE) and the extent of structural interactions, as indicated by the BIOVIA visualization tool. The selected complex was validated by a 100 ns MD simulation run using the GROMACS software. The dynamic behaviour and stability of the receptor-ligand complex were evaluated by the root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), solvent accessible surface area (SASA), solvent accessible surface volume (SASV) and number of hydrogen bonds.
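Two of the stability measures named above, RMSD and Rg, have simple closed forms, sketched below in plain Python. This is an illustration of the definitions only, not the GROMACS implementation: it assumes coordinates are given as (x, y, z) tuples, that the two structures compared by RMSD are already superimposed (no least-squares fitting is performed), and that atoms have unit mass unless masses are supplied. The function names are hypothetical.

```python
import math


def rmsd(coords, ref):
    """Root mean square deviation between two equally sized coordinate lists.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords) != len(ref) or not coords:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point, ref_point in zip(coords, ref)
        for a, b in zip(point, ref_point)
    )
    return math.sqrt(squared / len(coords))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration about the centre of mass."""
    if not coords:
        raise ValueError("coords must be non-empty")
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: unit masses when none are given
    total_mass = sum(masses)
    centre = [
        sum(m * p[k] for m, p in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    squared = sum(
        m * sum((p[k] - centre[k]) ** 2 for k in range(3))
        for m, p in zip(masses, coords)
    )
    return math.sqrt(squared / total_mass)
```

Evaluated frame by frame over a trajectory, a low and flat RMSD curve and a steady Rg are the usual signs of a stable, compact receptor-ligand complex.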
RESULTS: At RMSD = 0, the TBF polyphenols showed fairly strong physical interactions with SARS-CoV receptors under all possible combinations. The MBE of the TBF polyphenol-bound SARS-CoV complexes ranged from -4.6 to -8.3 kcal/mol. Analysis of the structural interactions showed the presence of hydrogen bonds, electrostatic interactions and hydrophobic interactions between the receptor residues (RRs) and ligand atoms. Based on the MBE values, the 3I6G-rhoifolin (MBE = -8.3 kcal/mol) and 5RE4-genistin (MBE = -7.6 kcal/mol) complexes had the lowest values. However, the latter showed a greater extent of interactions between the RRs and the ligand atoms and thus was further validated by MD simulation. The MD simulation parameters of the 5RE4-genistin complex over a 100 ns run indicated good structural stability with minimal flexibility within the genistin binding pocket region. The findings suggest that S. torvum polyphenols hold good therapeutic potential in COVID-19 management.