Predicting the tertiary structure of a protein from its amino acid sequence is one of the most important goals in bioinformatics. In this paper, we propose new feature groups based on the physical and physicochemical properties of amino acids (the size of the amino acids' side chains, and predicted secondary structure based on the normalized frequencies of β-strands, turns, and reverse turns) to tackle this task. The proposed features are extracted using a modified version of the feature extraction method of Dubchak et al. To study the effectiveness of the proposed features and the modified feature extraction method, we employ AdaBoost.M1, Multi-Layer Perceptron (MLP), and Support Vector Machine (SVM) classifiers, which have been commonly and successfully applied to the protein folding problem. Our experimental results show that the new feature groups, together with the modified feature extraction method, achieve higher protein fold prediction accuracy than previous works in the literature.
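For orientation, the kind of descriptor that Dubchak et al. introduced can be sketched as follows. This is a minimal illustration of the original composition, transition, and distribution idea, not the modified method proposed in this paper, whose details are not given in the abstract. The function name `ctd_features`, the three-way grouping `SIZE_GROUPS`, and the rounding convention used for the distribution positions are all assumptions made for the example.

```python
import math
from typing import Dict, List

# Hypothetical three-way grouping of residues by side-chain size.
# Illustration only; not the grouping used in the paper.
SIZE_GROUPS: Dict[str, int] = {}
for group_id, residues in enumerate(["GASCTPD", "NVEQIL", "MHKFRYW"], start=1):
    for aa in residues:
        SIZE_GROUPS[aa] = group_id


def ctd_features(sequence: str, groups: Dict[str, int] = SIZE_GROUPS) -> List[float]:
    """Return 21 features: 3 composition, 3 transition, 15 distribution."""
    # Map residues to group labels, skipping unrecognised symbols.
    labels = [groups[aa] for aa in sequence.upper() if aa in groups]
    n = len(labels)
    if n == 0:
        raise ValueError("sequence contains no recognised residues")

    # Composition: percentage of residues falling in each group.
    composition = [100.0 * labels.count(g) / n for g in (1, 2, 3)]

    # Transition: percentage of adjacent pairs that switch between two groups,
    # counted in either direction.
    pairs = list(zip(labels, labels[1:]))
    transition = []
    for a, b in ((1, 2), (1, 3), (2, 3)):
        switches = sum(1 for p in pairs if p in ((a, b), (b, a)))
        transition.append(100.0 * switches / len(pairs) if pairs else 0.0)

    # Distribution: relative chain position (in percent) at which the first,
    # 25%, 50%, 75%, and last residue of each group occurs.
    distribution = []
    for g in (1, 2, 3):
        positions = [i + 1 for i, lab in enumerate(labels) if lab == g]
        for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
            if not positions:
                distribution.append(0.0)
                continue
            # Assumed convention: round up, and use the first residue for 0%.
            k = max(1, math.ceil(fraction * len(positions)))
            distribution.append(100.0 * positions[k - 1] / n)

    return composition + transition + distribution
```

Calling `ctd_features` once per attribute (for example, once for a size grouping and once for each secondary structure propensity grouping) and concatenating the results would give one fixed-length vector per protein, which can then be passed to any of the classifiers named above.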