Sentiment classification is typically performed by combining features that represent the dataset at hand. Existing works have employed various feature types individually, such as syntactic, lexical, and machine-learning features, and some have hybridized them to reach promising results. Since the debate on the best combination is still unresolved, this paper empirically investigates combinations of features for product review classification. Results indicate that the top performer is a Support Vector Machine classification model combined with any of the observed lexicons, namely MPQA, BingLiu, and General Inquirer, and with either unigram features or an integration of unigram and bigram features.
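To make the feature combination concrete, the following is a minimal sketch of how unigram, bigram, and lexicon features might be merged into one feature map for a single review. It is not the authors' pipeline: the four-word `TOY_LEXICON`, the whitespace tokenizer, and the names `extract_features`, `__lex_pos__`, and `__lex_neg__` are all assumptions made for illustration, and the Support Vector Machine training step is omitted.

```python
from collections import Counter

# Hypothetical toy lexicon (word -> polarity). A real experiment would load
# MPQA, BingLiu, or General Inquirer instead of this stand-in.
TOY_LEXICON = {"good": 1, "great": 1, "bad": -1, "poor": -1}

PUNCTUATION = ".,!?;:"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    stripped = (token.strip(PUNCTUATION) for token in text.lower().split())
    return [token for token in stripped if token]


def ngram_counts(tokens, n):
    """Count contiguous n-grams, joining the tokens of each n-gram with a space."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def extract_features(text, lexicon=TOY_LEXICON):
    """Combine unigram counts, bigram counts, and lexicon hit counts."""
    tokens = tokenize(text)
    features = Counter()
    features.update(ngram_counts(tokens, 1))  # unigram features
    features.update(ngram_counts(tokens, 2))  # bigram features
    # Lexicon features: number of tokens with positive / negative polarity.
    features["__lex_pos__"] = sum(1 for t in tokens if lexicon.get(t, 0) > 0)
    features["__lex_neg__"] = sum(1 for t in tokens if lexicon.get(t, 0) < 0)
    return dict(features)


if __name__ == "__main__":
    print(extract_features("Great phone, poor battery."))
```

A feature map of this kind would then be vectorized and passed to a classifier such as a linear SVM; dropping the bigram line gives the unigram-only configuration.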
K-Means is an unsupervised method that partitions the input space into clusters. The K-Means algorithm has a weakness in detecting outliers, which appear in data across many research fields. A decade ago, Rough Set Theory (RST) was applied to the problem of cluster partitioning. Specifically, Rough K-Means (RKM) is one of the more powerful hybrid algorithms, and it has several extended versions. However, within the framework of existing rough clustering algorithms, a suitable method for detecting outliers is still needed. In this paper, we propose an effective method to detect local outliers in rough clustering. Applying the Local Outlier Factor (LOF) method within rough clustering improves the quality of the cluster partition. An existing version of the algorithm, π Rough K-Means (π RKM), is tested in the study. Finally, the effectiveness of the algorithm is demonstrated on synthetic and real datasets.
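For readers unfamiliar with LOF, the sketch below computes the standard Local Outlier Factor score for each point in a small dataset. It illustrates only the generic LOF definition on raw points; how the score is integrated into π RKM is not described in this abstract, so nothing here should be read as the proposed algorithm. The function name `local_outlier_factors`, the parameter `k`, and the rule for handling duplicate points are assumptions made for this example.

```python
import math


def local_outlier_factors(points, k):
    """Return the LOF score of every point, using its k nearest neighbors.

    Scores near 1 suggest a point is about as dense as its neighbors;
    scores well above 1 suggest a local outlier.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_distance = []  # distance from each point to its k-th nearest neighbor
    neighbors = []   # indices within that distance (ties are included)
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]
        k_distance.append(kd)
        neighbors.append([j for d, j in others if d <= kd])

    def reach_dist(i, j):
        """Reachability distance of point i with respect to point j."""
        return max(k_distance[j], dist[i][j])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        total = sum(reach_dist(i, j) for j in neighbors[i])
        lrd.append(len(neighbors[i]) / total if total > 0 else math.inf)

    # LOF: mean density of the neighbors divided by the point's own density.
    lof = []
    for i in range(n):
        if math.isinf(lrd[i]):
            # Assumption for this sketch: points sitting on exact duplicates
            # are treated as inliers.
            lof.append(1.0)
            continue
        mean_neighbor_lrd = sum(lrd[j] for j in neighbors[i]) / len(neighbors[i])
        lof.append(mean_neighbor_lrd / lrd[i])
    return lof


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
    for point, score in zip(data, local_outlier_factors(data, k=2)):
        print(point, round(score, 3))
```

In this toy dataset the four corner points of the unit square receive a score of 1, while the isolated point at (5, 5) scores well above 1. In a rough clustering setting, such scores could plausibly be used to flag or down-weight objects before cluster centers are updated, but that integration is an assumption here rather than something the abstract specifies.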