This paper presents an approach to breast cancer diagnosis in digital mammograms using the curvelet transform. After the mammogram images are decomposed in the curvelet basis, a special set of the largest coefficients is extracted as the feature vector. The Euclidean distance is then used to construct a supervised classifier. The experiments yielded a classification accuracy of 98.59%, which indicates that the curvelet transform is a promising tool for the analysis and classification of digital mammograms.
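The classification step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the curvelet coefficients of each image are already available as a flat list of numbers (the transform itself is not shown), that "largest" means largest in magnitude, and that a test image is assigned to the class whose mean feature vector is nearest in Euclidean distance. The function names `top_k_features`, `class_means`, and `classify` are hypothetical.

```python
import math


def top_k_features(coeffs, k):
    """Keep the k largest-magnitude coefficients, sorted in descending order.

    Assumption: the abstract's "special set of the largest coefficients"
    is approximated here by a plain top-k selection on absolute value.
    """
    return sorted((abs(c) for c in coeffs), reverse=True)[:k]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_means(features, labels):
    """Average the training feature vectors of each class (one prototype per class)."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + x for s, x in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def classify(feature, prototypes):
    """Return the label of the prototype closest to `feature` (assumed decision rule)."""
    return min(prototypes, key=lambda lab: euclidean(feature, prototypes[lab]))
```

In use, one would build `prototypes = class_means(train_features, train_labels)` from labeled training images and then call `classify(top_k_features(coeffs, k), prototypes)` for each test image. The value of `k` is a free parameter that the abstract does not specify.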
Bioinformatics has been an emerging area of research for the last three decades. Its ultimate aims are to store and manage biological data and to develop and analyze computational tools that enhance the understanding of those data. The volume of data accumulated by various sequencing projects is increasing exponentially, which presents difficulties for experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms have been proposed. Classifying protein sequences into existing superfamilies helps predict the structure and function of the large number of newly discovered proteins. Existing classification results are unsatisfactory because the various feature encoding methods produce a very large number of features. In this work, a statistical metric-based feature selection technique is proposed to reduce the size of the extracted feature vector. The proposed protein classification method shows significant improvement in terms of performance metrics such as accuracy, sensitivity, specificity, recall, and F-measure.
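The performance metrics named above have standard definitions in terms of confusion-matrix counts. The sketch below computes them for a single class treated as "positive" against all others. That one-vs-rest framing is an assumption, since the abstract does not say how the multi-class superfamily results were aggregated. The function name `classification_metrics` is hypothetical. Note that sensitivity and recall are the same quantity under these definitions.

```python
def _ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero (a convention chosen here)."""
    return num / den if den else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from true/false positive/negative counts."""
    sensitivity = _ratio(tp, tp + fn)          # true positive rate, also called recall
    specificity = _ratio(tn, tn + fp)          # true negative rate
    precision = _ratio(tp, tp + fp)
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "recall": sensitivity,                 # identical to sensitivity
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }
```

For a multi-class problem, one common choice is to compute these per class and then average them, but that is a design decision the abstract leaves open.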
Two novel and highly accurate hybrid models were developed for the prediction of the flammability limits (lower flammability limit (LFL) and upper flammability limit (UFL)) of pure compounds using a quantitative structure-property relationship approach. The two models were developed using a dataset obtained from the DIPPR Project 801 database, which comprises 1057 and 515 literature data points for the LFL and UFL, respectively. Multiple linear regression (MLR), logarithmic, and polynomial models were used to develop the models, following an algorithm and code written in MATLAB. The results indicated that the proposed models were capable of predicting LFL and UFL values with accuracies that were among the best (i.e. most optimised) reported in the literature (LFL: R² = 99.72%, with an average absolute relative deviation (AARD) of 0.8%; UFL: R² = 99.64%, with an AARD of 1.41%). These hybrid models are unique in that they were developed using a modified mathematical technique that combines three conventional methods. These models are practical and can be used as cost-effective alternatives to experimental measurements of LFL and UFL values for a wide range of pure compounds.
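The two accuracy figures quoted above can be computed as in the sketch below. It assumes that R² denotes the usual coefficient of determination (reported in the abstract as a percentage) and that AARD is the mean of |predicted − experimental| / |experimental|, expressed in percent. The abstract does not spell out either formula, and the function names are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (a fraction, not a percentage)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def aard_percent(observed, predicted):
    """Average absolute relative deviation, in percent.

    Assumes no observed value is zero, which holds for flammability limits.
    """
    n = len(observed)
    return 100.0 / n * sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted))
```

Multiplying the result of `r_squared` by 100 gives a value on the same scale as the percentages reported in the abstract.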
A new mathematical model has been developed that expresses the toxicities (EC₅₀ values) of a wide variety of ionic liquids (ILs) towards the freshwater flea Daphnia magna by means of a quantitative structure-activity relationship (QSAR). The data were analyzed using summed contributions from the cations, their alkyl substituents, and the anions. The model employed multiple linear regression analysis with a polynomial model, implemented in MATLAB. The model predicted IL toxicities with R² = 0.974 and a standard error of estimate of 0.028. This model affords a practical, cost-effective and convenient alternative to experimental ecotoxicological assessment of many ILs.
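The idea of summing contributions from the cation, its alkyl substituents, and the anion can be illustrated with the sketch below. Everything numeric in it is a made-up placeholder chosen only to show the mechanics: the contribution values are not taken from the paper, the scale of the predicted quantity (for instance a logarithm of EC₅₀) is not stated in the abstract, and the polynomial terms of the actual model are omitted. The names `CATION`, `ALKYL`, `ANION`, and `predict_toxicity` are hypothetical.

```python
# PLACEHOLDER contribution tables: illustrative numbers only, NOT fitted values.
CATION = {"imidazolium": -1.0, "pyridinium": -1.2}
ALKYL = {"methyl": -0.1, "butyl": -0.5}
ANION = {"chloride": 0.0, "bromide": 0.05}


def predict_toxicity(cation, alkyl_groups, anion):
    """Sum the cation, alkyl-substituent, and anion contributions.

    Assumption: a purely additive form; the model in the abstract also
    involves a polynomial component that is not reproduced here.
    """
    alkyl_sum = sum(ALKYL[group] for group in alkyl_groups)
    return CATION[cation] + alkyl_sum + ANION[anion]
```

In a real workflow the contribution values would be regression coefficients estimated from measured EC₅₀ data, for example by least squares over an indicator matrix of structural fragments.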