Posterior Segment Eye Diseases (PSED), namely Diabetic Retinopathy (DR), glaucoma and Age-related Macular Degeneration (AMD), are the prime causes of vision loss globally. Vision loss can be prevented if these diseases are detected at an early stage. Structural abnormalities such as changes in cup-to-disc ratio, Hard Exudates (HE), drusen, Microaneurysms (MA), Cotton Wool Spots (CWS), Haemorrhages (HA), Geographic Atrophy (GA) and Choroidal Neovascularization (CNV) in PSED can be identified by manual examination of fundus images by clinicians. However, manual screening is labour-intensive, tiresome and time-consuming. Hence, there is a need to automate eye screening. In this work, the Bi-dimensional Empirical Mode Decomposition (BEMD) technique is used to decompose fundus images into 2D Intrinsic Mode Functions (IMFs) to capture variations in the pixels due to morphological changes. Further, various entropies, namely Renyi, Fuzzy, Shannon, Vajda, Kapur and Yager, and energy features are extracted from the IMFs. These extracted features are ranked using Chernoff Bound and Bhattacharyya Distance (CBBD), Kullback-Leibler Divergence (KLD), Fuzzy-minimum Redundancy Maximum Relevance (FmRMR), Wilcoxon, Receiver Operating Characteristics Curve (ROC) and t-test methods. Further, these ranked features are fed to a Support Vector Machine (SVM) classifier to classify normal and abnormal (DR, AMD and glaucoma) classes. The performance of the proposed eye screening system is evaluated using 800 (Normal=400 and Abnormal=400) digital fundus images and the 10-fold cross-validation method. Our proposed system automatically identifies normal and abnormal classes with an average accuracy of 88.63%, sensitivity of 86.25% and specificity of 91% using 17 optimal features ranked using CBBD and an SVM-Radial Basis Function (RBF) classifier. Moreover, a novel Retinal Risk Index (RRI) is developed using two significant features to distinguish the two classes using a single number.
Such a system helps to reduce eye screening time in polyclinics or community-based mass screening. Patients are then referred to main hospitals only if the diagnosis belongs to the abnormal class. Hence, the main hospitals will not be unnecessarily crowded and doctors can devote their time to other urgent cases.
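As a rough illustration of the entropy features mentioned above, the following sketch computes the Shannon and Renyi entropies of a grey-level histogram. It is a minimal example under stated assumptions, not the authors' implementation: the function names, the Renyi order `alpha=2` and the toy pixel list standing in for an IMF are all hypothetical, and real IMF values would first need to be binned.

```python
import math
from collections import Counter

def histogram_probabilities(pixels):
    """Return the relative frequency of each distinct value in a flat pixel list."""
    counts = Counter(pixels)
    total = len(pixels)
    return [c / total for c in counts.values()]

def shannon_entropy(probs):
    """Shannon entropy in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def renyi_entropy(probs, alpha=2.0):
    """Renyi entropy of order alpha (alpha != 1), in bits."""
    if alpha == 1:
        raise ValueError("alpha must differ from 1; use shannon_entropy instead")
    return math.log2(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)

# Hypothetical stand-in for one quantised IMF: four values, equally frequent.
imf_pixels = [0, 64, 128, 255] * 4
probs = histogram_probabilities(imf_pixels)
shannon = shannon_entropy(probs)
renyi = renyi_entropy(probs)
print(shannon, renyi)
```

For a uniform histogram both measures coincide, which makes this toy input a convenient sanity check before applying the functions to real data.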
An estimated 6.5 million patients in the United States are affected by chronic wounds, with more than US$25 billion and countless hours spent annually for all aspects of chronic wound care. There is a need for an intelligent software tool to analyze wound images, characterize wound tissue composition, measure wound size, and monitor changes in wound in between visits. Performed manually, this process is very time-consuming and subject to intra- and inter-reader variability. In this work, our objective is to develop methods to segment, measure and characterize clinically presented chronic wounds from photographic images. The first step of our method is to generate a Red-Yellow-Black-White (RYKW) probability map, which then guides the segmentation process using either optimal thresholding or region growing. The red, yellow and black probability maps are designed to handle the granulation, slough and eschar tissues, respectively; while the white probability map is to detect the white label card for measurement calibration purposes. The innovative aspects of this work include defining a four-dimensional probability map specific to wound characteristics, a computationally efficient method to segment wound images utilizing the probability map, and auto-calibration of wound measurements using the content of the image. These methods were applied to 80 wound images, captured in a clinical setting at the Ohio State University Comprehensive Wound Center, with the ground truth independently generated by the consensus of at least two clinicians. While the mean inter-reader agreement between the readers varied between 67.4% and 84.3%, the computer achieved an average accuracy of 75.1%.
Age-related Macular Degeneration (AMD) is one of the major causes of vision loss and blindness in the ageing population. Currently, there is no cure for AMD; however, early detection and subsequent treatment may prevent severe vision loss or slow the progression of the disease. AMD can be classified into two types: dry and wet AMD. People with macular degeneration are mostly affected by dry AMD. Early symptoms of AMD are the formation of drusen and yellow pigmentation. These lesions are identified by manual inspection of fundus images by ophthalmologists. This is a time-consuming, tiresome process, and hence an automated AMD screening tool can aid clinicians in their diagnosis significantly. This study proposes an automated dry AMD detection system using various entropies (Shannon, Kapur, Renyi and Yager), Higher Order Spectra (HOS) bispectra features, Fractional Dimension (FD), and Gabor wavelet features extracted from greyscale fundus images. The features are ranked using t-test, Kullback-Leibler Divergence (KLD), Chernoff Bound and Bhattacharyya Distance (CBBD), Receiver Operating Characteristics (ROC) curve-based and Wilcoxon ranking methods in order to select optimum features, and are classified into normal and AMD classes using Naive Bayes (NB), k-Nearest Neighbour (k-NN), Probabilistic Neural Network (PNN), Decision Tree (DT) and Support Vector Machine (SVM) classifiers. The performance of the proposed system is evaluated using private (Kasturba Medical Hospital, Manipal, India), Automated Retinal Image Analysis (ARIA) and STructured Analysis of the Retina (STARE) datasets. The proposed system yielded the highest average classification accuracies of 90.19%, 95.07% and 95% with 42, 54 and 38 optimal ranked features using the SVM classifier for the private, ARIA and STARE datasets respectively.
This automated AMD detection system can be used for mass fundus image screening and aid clinicians by making better use of their expertise on selected images that require further examination.
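The t-test ranking step described above can be sketched as follows. This is an illustrative example only: it assumes Welch's unequal-variance form of the t statistic, and the function names and the tiny two-feature dataset are hypothetical rather than taken from the study.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples.

    Assumes at least one sample has non-zero variance.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def rank_features(normal, abnormal):
    """Rank feature indices by |t|, most discriminative first.

    normal, abnormal: lists of feature vectors, one vector per image.
    """
    n_features = len(normal[0])
    scores = []
    for j in range(n_features):
        col_a = [row[j] for row in normal]
        col_b = [row[j] for row in abnormal]
        scores.append((abs(welch_t(col_a, col_b)), j))
    return [j for _, j in sorted(scores, reverse=True)]

# Hypothetical data: feature 0 separates the classes, feature 1 does not.
normal = [[1.0, 5.0], [1.1, 4.0], [0.9, 6.0]]
amd = [[2.0, 5.5], [2.1, 4.5], [1.9, 5.0]]
ranking = rank_features(normal, amd)
print(ranking)
```

The top-ranked indices would then be passed to a classifier, adding features one at a time until accuracy stops improving.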
This paper presents a study on gene knockout strategies to identify candidate genes to be knocked out for improving the production of succinic acid in Escherichia coli. Succinic acid is widely used as a precursor for many chemicals, for example in the production of antibiotics, therapeutic proteins and food. However, chemical synthesis of succinic acid using traditional methods usually results in production that is far below the theoretical maximum. In silico gene knockout strategies are commonly implemented to delete genes in E. coli to overcome this problem. In this paper, a hybrid of Ant Colony Optimization (ACO) and Minimization of Metabolic Adjustment (MoMA) is proposed to identify gene knockout strategies to improve the production of succinic acid in E. coli. As a result, the hybrid algorithm generated a list of knockout genes, together with the succinic acid production rate and growth rate of E. coli after gene knockout. The results of the hybrid algorithm were compared with the previous methods, OptKnock and MOMAKnock. It was found that the hybrid algorithm performed better than OptKnock and MOMAKnock in terms of the production rate. The results produced by the hybrid algorithm can be used in wet laboratory experiments to increase the production of succinic acid in E. coli.
The Electrocardiogram (ECG) is the P-QRS-T wave depicting the cardiac activity of the heart. The subtle changes in the electric potential patterns of repolarization and depolarization are indicative of the disease afflicting the patient. These clinical time domain features of the ECG waveform can be used in cardiac health diagnosis. Due to the presence of noise and minute morphological parameter values, it is very difficult to identify the ECG classes accurately by the naked eye. Various computer aided cardiac diagnosis (CACD) systems, analysis methods, challenges addressed and the future of cardiovascular disease screening are reviewed in this paper. Methods developed for time domain, frequency transform domain, and time-frequency domain analysis, such as the wavelet transform, cannot by themselves represent the inherent distinguishing features accurately. Hence, nonlinear methods which can capture the small variations in the ECG signal and provide improved accuracy in the presence of noise are discussed in greater detail in this review. A CACD system exploiting these nonlinear features can help clinicians to diagnose cardiovascular disease more accurately.
Many biological research areas such as drug design require gene regulatory networks to provide clear insight and understanding of the cellular process in living cells. This is because interactions among the genes and their products play an important role in many molecular processes. A gene regulatory network can act as a blueprint for the researchers to observe the relationships among genes. Due to its importance, several computational approaches have been proposed to infer gene regulatory networks from gene expression data. In this review, six inference approaches are discussed: Boolean network, probabilistic Boolean network, ordinary differential equation, neural network, Bayesian network, and dynamic Bayesian network. These approaches are discussed in terms of introduction, methodology and recent applications of these approaches in gene regulatory network construction. These approaches are also compared in the discussion section. Furthermore, the strengths and weaknesses of these computational approaches are described.
A computer-aided detection auto-probing (CADAP) system is presented for detecting breast lesions using dynamic contrast enhanced magnetic resonance imaging, through a spatial-based discrete Fourier transform. The stand-alone CADAP system reduces noise, refines region of interest (ROI) automatically, and detects the breast lesion with minimal false positive detection. The lesions are then classified and colourised according to their characteristics, whether benign, suspicious or malignant. To enhance the visualisation, the entire analysed ROI is constructed into a 3-D image, so that the user can diagnose based on multiple views on the ROI. The proposed method has been applied to 101 sets of digital images, and the results compared with the biopsy results done by radiologists. The proposed scheme is able to identify breast cancer regions accurately and efficiently.
Diabetes mellitus (DM) affects a considerable number of people in the world and the number of cases is increasing every year. Due to a strong link to the genetic basis of the disease, it is extremely difficult to cure. However, it can be controlled to prevent severe consequences, such as organ damage. Therefore, diabetes diagnosis and monitoring of its treatment are very important. In this paper, we propose a non-invasive diagnosis support system for DM. The system determines whether or not diabetes is present by assessing the cardiac health of a patient using heart rate variability (HRV) analysis. This analysis was based on nine nonlinear features drawn from Approximate Entropy (ApEn), the largest Lyapunov exponent (LLE), detrended fluctuation analysis (DFA) and recurrence quantification analysis (RQA). Clinically significant measures were used as input to classification algorithms, namely AdaBoost, decision tree (DT), fuzzy Sugeno classifier (FSC), k-nearest neighbor algorithm (k-NN), probabilistic neural network (PNN) and support vector machine (SVM). Ten-fold stratified cross-validation was used to select the best classifier. AdaBoost, with least squares (LS) as weak learner, performed better than the other classifiers, yielding an average accuracy of 90%, sensitivity of 92.5% and specificity of 88.7%.
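Of the nonlinear features listed above, Approximate Entropy is the simplest to illustrate. The sketch below follows the standard ApEn definition with self-matches counted; the parameter defaults (`m=2`, tolerance of 0.2 times the standard deviation) are common conventions assumed here, not values reported in the paper, and the two test series are synthetic.

```python
import math
import random
from statistics import pstdev

def approximate_entropy(series, m=2, r_factor=0.2):
    """Approximate Entropy (ApEn) of a 1-D series.

    m is the embedding dimension; the tolerance r is r_factor times the
    population standard deviation of the series. Self-matches are counted.
    """
    n = len(series)
    r = r_factor * pstdev(series)

    def phi(dim):
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for t_i in templates:
            # Count templates within Chebyshev distance r of t_i.
            matches = sum(
                1 for t_j in templates
                if max(abs(a - b) for a, b in zip(t_i, t_j)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)

# Synthetic stand-ins for RR-interval series: one regular, one irregular.
periodic = [0.0, 1.0] * 50
random.seed(0)
noisy = [random.random() for _ in range(100)]
print(approximate_entropy(periodic), approximate_entropy(noisy))
```

A perfectly regular series yields an ApEn close to zero, while an irregular one yields a larger value, which is the property that makes the measure useful as an HRV feature.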
Wavelet packet transform decomposes a signal into a set of orthonormal bases (nodes) and provides opportunities to select an appropriate set of these bases for feature extraction. In this paper, multi-level basis selection (MLBS) is proposed to preserve the most informative bases of a wavelet packet decomposition tree through removing less informative bases by applying three exclusion criteria: frequency range, noise frequency, and energy threshold. MLBS achieved an accuracy of 97.56% for classifying normal heart sound, aortic stenosis, mitral regurgitation, and aortic regurgitation. MLBS is a promising basis selection to be suggested for signals with a small range of frequencies.
A drastic improvement in the analysis of gene expression has led to new discoveries in bioinformatics research. In order to analyse gene expression data, fuzzy clustering algorithms are widely used. However, the resulting analyses from these specific types of algorithms may lead to confusion in hypotheses with regard to the suggestion of a dominant function for genes of interest. Besides that, current fuzzy clustering algorithms do not conduct a thorough analysis of genes with low membership values. Therefore, we present a novel computational framework called the "multi-stage filtering-Clustering Functional Annotation" (msf-CluFA) for clustering gene expression data. The framework consists of four components: fuzzy c-means clustering (msf-CluFA-0), achieving dominant cluster (msf-CluFA-1), improving confidence level (msf-CluFA-2) and a combination of msf-CluFA-0, msf-CluFA-1 and msf-CluFA-2 (msf-CluFA-3). By employing double filtering in msf-CluFA-1 and apriori algorithms in msf-CluFA-2, our new framework is capable of determining the dominant clusters and improving the confidence level of genes with lower membership values, by means of which the unknown genes can be predicted.
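The membership values that the framework filters on come from fuzzy c-means. As a minimal sketch, the standard membership update for a single point is shown below; the fuzzifier `m=2.0`, the function name and the example coordinates are assumptions for illustration and are not taken from msf-CluFA itself.

```python
import math

def fcm_memberships(point, centroids, m=2.0):
    """Fuzzy c-means membership of one point in each cluster.

    u_k = 1 / sum_j (d_k / d_j) ** (2 / (m - 1)), with fuzzifier m > 1.
    """
    dists = [math.dist(point, c) for c in centroids]
    # A point sitting exactly on a centroid belongs fully to that cluster.
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]

# Hypothetical expression profile and two cluster centres.
memberships = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
dominant = max(range(len(memberships)), key=memberships.__getitem__)
print(memberships, dominant)
```

Memberships always sum to one, so a gene with no clearly dominant cluster shows up as a nearly flat membership vector, which is the low-membership case the abstract singles out.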
The structural comparison of proteins is a vital step in structural biology that is used to predict and analyse a new unknown protein function. Although a number of different techniques have been explored, the study to develop new alternative methods is still an active research area. The present paper introduces a text modelling-based technique for the structural comparison of proteins. The method models the secondary and tertiary structure of proteins in two linear sequences and then applies them to the comparison of two structures. The technique used for pairwise comparison of the sequences has been adopted from computational linguistics and its well-known techniques for analysing and quantifying textual sequences. To this end, an n-gram modelling technique is used to capture regularities between sequences, and then, the cross-entropy concept is employed to measure their similarities. Several experiments are conducted to evaluate the performance of the method and compare it with other commonly used programs. The assessments for information retrieval evaluation demonstrate that the technique has a high running speed, which is similar to other linear encoding methods, such as 3D-BLAST, SARST, and TS-AMIR, whereas its accuracy is comparable to CE and TM-align, which are high accuracy comparison tools. Accordingly, the results demonstrate that the algorithm has high efficiency compared with other state-of-the-art methods.
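The n-gram and cross-entropy idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the trigram order, the add-one smoothing, the function names and the secondary-structure strings are all hypothetical choices, not the encoding or smoothing used in the paper.

```python
import math
from collections import Counter

def ngrams(seq, n):
    """Return all overlapping n-grams of a string."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]

def cross_entropy(model_seq, test_seq, n=3):
    """Cross-entropy (bits per n-gram) of test_seq under an n-gram model of model_seq.

    Uses add-one smoothing over the n-grams seen in either sequence
    (an illustrative choice). Lower values mean more similar sequences.
    """
    counts = Counter(ngrams(model_seq, n))
    test = ngrams(test_seq, n)
    vocab = set(counts) | set(test)
    total = sum(counts.values()) + len(vocab)
    return -sum(math.log2((counts[g] + 1) / total) for g in test) / len(test)

# Hypothetical linear encodings (H = helix, E = strand, C = coil).
helix_strand = "HHHHEEEEHHHHEEEE"
all_coil = "CCCCCCCCCCCCCCCC"
self_score = cross_entropy(helix_strand, helix_strand)
other_score = cross_entropy(helix_strand, all_coil)
print(self_score, other_score)
```

Because the score is not symmetric in its two arguments, a practical comparison would likely combine both directions; that detail is left open here.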
Psoriasis is an incurable skin disorder affecting 2-3% of the world population. The scaliness of psoriasis is a key assessment parameter of the Psoriasis Area and Severity Index (PASI). Dermatologists typically use visual and tactile senses in PASI scaliness assessment. However, the assessment can be subjective resulting in inter- and intra-rater variability in the scores. This paper proposes an assessment method that incorporates 3D surface roughness with standard clustering techniques to objectively determine the PASI scaliness score for psoriasis lesions. A surface roughness algorithm using structured light projection has been applied to 1999 3D psoriasis lesion surfaces. The algorithm has been validated with an accuracy of 94.12%. Clustering algorithms were used to classify the surface roughness measured using the proposed assessment method for PASI scaliness scoring. The reliability of the developed PASI scaliness algorithm was high with kappa coefficients>0.84 (almost perfect agreement).
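The agreement figure quoted above is a kappa coefficient. The abstract does not say which variant was used, so the sketch below assumes unweighted Cohen's kappa between two sets of scaliness scores; the function name and the score lists are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of categorical scores.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical scaliness scores (0 to 4) from the algorithm and a dermatologist.
algorithm_scores = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
clinician_scores = [0, 1, 2, 3, 4, 0, 1, 2, 3, 3]
kappa = cohen_kappa(algorithm_scores, clinician_scores)
print(kappa)
```

Since the scaliness scores are ordinal, a weighted kappa that penalises large disagreements more heavily would be a natural alternative.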
Remote protein homology detection and fold recognition refer to the detection of structural homology in proteins where there are small or no similarities in the sequence. To detect protein structural classes from protein primary sequence information, homology-based methods have been developed, which can be divided into three types: discriminative classifiers, generative models for protein families and pairwise sequence comparisons. Support Vector Machines (SVM) and Neural Networks (NN) are two popular discriminative methods. Recent studies have shown that SVM trains faster and is more accurate and efficient than NN. We present a comprehensive method based on two-layer classifiers. The first layer is used to detect up to the superfamily and family levels in the SCOP hierarchy using optimized binary SVM classification rules. It uses the kernel function known as the Bio-kernel, which incorporates biological information in the classification process. The second layer uses a discriminative SVM algorithm with a string kernel that detects up to the protein fold level in the SCOP hierarchy. The results obtained were evaluated using mean ROC and mean MRFP, and the significance of the results was tested with a pairwise t-test. Experimental results show that our approach significantly improves the performance of remote protein homology detection and fold recognition for all three versions of the SCOP dataset (1.53, 1.67 and 1.73). We achieved improvements in terms of mean ROC of 4.19% in SCOP 1.53, 4.75% in SCOP 1.67 and 4.03% in SCOP 1.73 when compared to the results produced by well-known methods. The combination of the first and second layers of BioSVM-2L performs well in remote homology detection and fold recognition across the three different versions of the dataset.
This paper introduces an approach to perform segmentation of regions in computed tomography (CT) images that exhibit intra-region intensity variations and at the same time have similar intensity distributions with surrounding/adjacent regions. In this work, we adapt a feature computed from wavelet transform called wavelet energy to represent the region information. The wavelet energy is embedded into a level set model to formulate the segmentation model called wavelet energy-guided level set-based active contour (WELSAC). The WELSAC model is evaluated using several synthetic and CT images focusing on tumour cases, which contain regions demonstrating the characteristics of intra-region intensity variations and having high similarity in intensity distributions with the adjacent regions. The obtained results show that the proposed WELSAC model is able to segment regions of interest in close correspondence with the manual delineation provided by the medical experts and to provide a solution for tumour detection.
Monitoring enlargement of the foveal avascular zone (FAZ) area enables physicians to monitor the progression of diabetic retinopathy (DR). At present, it is difficult to discern the FAZ area and to measure its enlargement in an objective manner using digital fundus images. A semi-automated approach for determination of the FAZ using color images has been developed. Here, a binary map of retinal blood vessels is computer-generated from the digital fundus image to determine vessel ends and pathologies surrounding the FAZ for area analysis. The proposed method is found to achieve accuracies from 66.67% to 98.69%, compared to accuracies of 18.13-95.07% obtained by manual segmentation of FAZ regions from digital fundus images.
Protein-protein interactions (PPIs) play a significant role in many crucial cellular operations such as metabolism, signaling and regulation. Computational methods for predicting PPIs have shown tremendous growth in recent years, but problems such as huge false positive rates have contributed to the lack of solid PPI information. We aimed at enhancing the overlap between computational predictions and experimental results in an effort to partially remove falsely predicted PPIs. A protein function predictor named PFP(), which is based on shared interacting domain patterns, is introduced in this study with the purpose of aiding the Gene Ontology Annotations (GOA). We used GOA and PFP() as agents in a filtering process to reduce false positive pairs in the computationally predicted PPI datasets. The functions predicted by PFP() were extracted from cross-species PPI data in order to assign novel functional annotations to uncharacterized proteins and additional functions to those that are already characterized by the Gene Ontology (GO). The implementation of PFP() increased the chances of finding a matching function annotation for the first rule in the filtration process by as much as 20%. To assess the capability of the proposed framework in filtering false PPIs, we applied it to the available S. cerevisiae PPIs and measured the performance in two aspects: the improvement made, indicated as Signal-to-Noise Ratio (SNR), and the strength of improvement. The proposed filtering framework achieved significantly better performance in both metrics than the baseline without it.
The use of vascular intersection aberration as one of the signs when monitoring and diagnosing diabetic retinopathy from retina fundus images (FIs) has been widely reported in the literature. In this paper, a new hybrid approach called the combined cross-point number (CCN) method able to detect the vascular bifurcation and intersection points in FIs is proposed. The CCN method makes use of two vascular intersection detection techniques, namely the modified cross-point number (MCN) method and the simple cross-point number (SCN) method. Our proposed approach was tested on images obtained from two different and publicly available fundus image databases. The results show a very high precision, accuracy, sensitivity and low false rate in detecting both bifurcation and crossover points compared with both the MCN and the SCN methods.
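For readers unfamiliar with cross-point number techniques, the sketch below implements the classical crossing-number test on a one-pixel-wide binary vessel skeleton. It is an assumption-laden illustration: the MCN, SCN and CCN methods named above may use different windows and decision rules, and the function names and toy skeletons are hypothetical.

```python
# 8-neighbour offsets in clockwise order, starting from the pixel above.
NEIGHBOURS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def crossing_number(skeleton, row, col):
    """Half the number of 0/1 transitions around the 8-neighbour ring of a pixel."""
    ring = [skeleton[row + dr][col + dc] for dr, dc in NEIGHBOURS]
    return sum(abs(ring[i] - ring[(i + 1) % 8]) for i in range(8)) // 2

def classify_points(skeleton):
    """Return (bifurcations, crossovers) as lists of (row, col) for interior pixels."""
    bifurcations, crossovers = [], []
    for r in range(1, len(skeleton) - 1):
        for c in range(1, len(skeleton[0]) - 1):
            if skeleton[r][c] != 1:
                continue
            cn = crossing_number(skeleton, r, c)
            if cn == 3:
                bifurcations.append((r, c))
            elif cn >= 4:
                crossovers.append((r, c))
    return bifurcations, crossovers

# Toy skeletons: two vessels crossing, and one vessel branching (T shape).
cross = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
branch = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
print(classify_points(cross), classify_points(branch))
```

On real skeletons a crossover often appears as two nearby bifurcations rather than a single four-way pixel, which is one reason combined schemes are of interest.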
Protein domains contain information about the prediction of protein structure, function, evolution and design since the protein sequence may contain several domains with different or the same copies of the protein domain. In this study, we propose an algorithm named SplitSSI-SVM that works with the following steps. First, the training and testing datasets are generated to test SplitSSI-SVM. Second, the protein sequence is split into subsequences based on order and disorder regions. A protein sequence that is longer than 600 residues is split into subsequences to investigate the effectiveness of protein domain prediction based on subsequences. Third, multiple sequence alignment is performed to predict the secondary structure using bidirectional recurrent neural networks (BRNN), where the BRNN considers the interaction between amino acids. The information about protein secondary structure is used to strengthen the protein domain boundary signal. Lastly, support vector machines (SVM) are used to classify the protein domain into single-domain, two-domain and multiple-domain. SplitSSI-SVM is developed to reduce misleading signals, to counter the lower protein domain signal caused by the primary structure of the protein sequence and to provide accurate classification of the protein domain. The performance of SplitSSI-SVM is evaluated using sensitivity and specificity on single-domain, two-domain and multiple-domain. The evaluation shows that SplitSSI-SVM achieved better results compared with other protein domain predictors such as DOMpro, GlobPlot, Dompred-DPS, Mateo, Biozon, Armadillo, KemaDom, SBASE, HMMPfam and HMMSMART, especially in two-domain and multiple-domain.
The flow of an electrically conducting fluid characterizing blood through arteries having irregularly shaped multi-stenoses in the presence of a uniform transverse magnetic field is analysed. The flow is considered to be axisymmetric, with an outline of the irregular stenoses obtained from a three-dimensional casting of a mildly stenosed artery, so that the physical problem becomes more realistic from the physiological point of view. The marker and cell (MAC) and successive-over-relaxation (SOR) methods are respectively used to solve the governing unsteady magnetohydrodynamic (MHD) equations and the pressure-Poisson equation quantitatively and to observe the flow separation. The results obtained show that the flow separates mostly towards the downstream side of the multi-stenoses. However, the flow separation region keeps shrinking with increasing intensity of the magnetic field and disappears completely for a sufficiently large value of the Hartmann number. The present observations have some clinical implications relating to magnetotherapy, which helps reduce the complex flow separation zones that cause flow disorder leading to the formation and progression of arterial diseases.
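The SOR step mentioned above can be illustrated on a small Cartesian grid. The sketch below solves a discrete Poisson equation with fixed (Dirichlet) boundary values; it is a simplification under stated assumptions, since the pressure equation in a MAC scheme is posed in the paper's axisymmetric geometry and typically uses different boundary conditions. The function name, relaxation factor `omega=1.5` and the test grid are hypothetical.

```python
def sor_poisson(rhs, boundary, omega=1.5, tol=1e-10, max_iter=10_000):
    """Solve laplace(p) = rhs on a unit-spaced grid by successive over-relaxation.

    boundary: 2-D list whose outer ring holds the Dirichlet values; interior
    entries serve as the initial guess. Returns the relaxed grid.
    """
    p = [row[:] for row in boundary]
    rows, cols = len(p), len(p[0])
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                # Gauss-Seidel value from the five-point stencil.
                gs = 0.25 * (p[i - 1][j] + p[i + 1][j]
                             + p[i][j - 1] + p[i][j + 1] - rhs[i][j])
                new = (1.0 - omega) * p[i][j] + omega * gs
                max_change = max(max_change, abs(new - p[i][j]))
                p[i][j] = new
        if max_change < tol:
            break
    return p

# Test case with a known answer: zero source term and boundary values p = x,
# whose exact solution is p = x everywhere.
size = 5
rhs = [[0.0] * size for _ in range(size)]
boundary = [
    [float(j) if i in (0, size - 1) or j in (0, size - 1) else 0.0
     for j in range(size)]
    for i in range(size)
]
solution = sor_poisson(rhs, boundary)
print(solution[2])
```

Setting `omega=1.0` recovers plain Gauss-Seidel; values between 1 and 2 usually converge in fewer sweeps on grids like this one.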
This paper presents a comparative study between wavelet and curvelet transforms for breast cancer diagnosis in digital mammograms. Using multiresolution analysis, mammogram images are decomposed into different resolution levels, which are sensitive to different frequency bands. A set of the biggest coefficients from each decomposition level is extracted. Then a supervised classifier system based on Euclidean distance is constructed. The performance of the classifier is evaluated using 2 x 5-fold cross-validation followed by a statistical analysis. The experimental results suggest that the curvelet transform outperforms the wavelet transform and that the difference is statistically significant.
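The two steps described above, keeping the biggest coefficients and classifying by Euclidean distance, can be sketched as follows. The abstract does not specify the classifier's exact form, so a nearest-class-mean rule is assumed here; the function names, the number of coefficients kept and all numeric values are hypothetical.

```python
import math

def biggest_coefficients(coeffs, k):
    """Keep the k largest-magnitude coefficients, in descending order of magnitude."""
    return sorted(coeffs, key=abs, reverse=True)[:k]

def class_means(train):
    """Compute the mean feature vector of each class.

    train: dict mapping a class label to a list of equal-length feature vectors.
    """
    return {
        label: [sum(col) / len(col) for col in zip(*vectors)]
        for label, vectors in train.items()
    }

def classify(features, means):
    """Assign the label whose class mean is closest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(features, means[label]))

# Hypothetical training features (three biggest coefficients per image).
train = {
    "normal": [[9.0, 8.0, 7.0], [10.0, 8.0, 6.0]],
    "abnormal": [[20.0, 15.0, 12.0], [22.0, 17.0, 10.0]],
}
means = class_means(train)
sample = biggest_coefficients([1.0, 21.0, 3.0, 16.0, 11.0, 0.5], 3)
label = classify(sample, means)
print(sample, label)
```

In a multi-level decomposition the same selection would be applied per level and the results concatenated into one feature vector before classification.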