An Improved B-hill Climbing Optimization Technique for Solving the Text Documents Clustering Problem

Abualigah LM; Hanandeh ES; Khader AT; Otair MA; Shandilya SK

doi:10.2174/1573405614666180903112541

Abualigah LM ¹ , Hanandeh ES ² , Khader AT ³ , Otair MA ¹ , Shandilya SK ⁴

Affiliations

¹ Faculty of Computer Sciences and Informatics, Amman Arab University, Amman - 11953, Jordan
² Department of Computer Information System, Zarqa University, P.O. Box 13132, Zarqa, Jordan
³ School of Computer Science, Universiti Sains Malaysia, Penang, Malaysia
⁴ Department of Computer Science & Engineering, NRI Institute of Information Science and Technology, Bhopal, India

Curr Med Imaging, 2020;16(4):296-306.

PMID: 32410533 DOI: 10.2174/1573405614666180903112541

Abstract

BACKGROUND: The volume of text documents on Internet pages keeps growing, and handling such a large amount of information is difficult. Text clustering is a common optimization problem in which a large collection of text documents is partitioned into coherent clusters of similar documents.

AIMS: This paper presents a novel local search technique, β-hill climbing, adapted to the text document clustering problem so that similar documents are partitioned into the same cluster.

METHODS: The β parameter is the primary innovation in the β-hill climbing technique. It was introduced to balance local and global search. Local search methods, such as the k-medoids and k-means techniques, have been successfully applied to the text document clustering problem.
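As a rough illustration of how a β operator can balance local and global search, the following is a minimal sketch and not the authors' implementation. It assumes the commonly described form of β-hill climbing: a neighbourhood move followed by a step that, with probability β, resets each decision variable to a random value, with the candidate kept only if it is no worse. The function names, the squared-distance objective, and the representation of documents as numeric vectors (for example, term weights) are all assumptions made for this example.

```python
import random


def cost(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    total = 0.0
    for p, c in zip(points, labels):
        for d in range(dim):
            # counts[c] >= 1 here because point p itself belongs to cluster c
            total += (p[d] - sums[c][d] / counts[c]) ** 2
    return total


def beta_hill_climbing(points, k, beta=0.05, iterations=2000, seed=0):
    """Hypothetical sketch: hill climbing over cluster labels with a beta step."""
    rng = random.Random(seed)
    n = len(points)
    labels = [rng.randrange(k) for _ in range(n)]
    best = cost(points, labels, k)
    for _ in range(iterations):
        candidate = labels[:]
        # Neighbourhood move (local search): reassign one random document.
        candidate[rng.randrange(n)] = rng.randrange(k)
        # Beta step (global search): each label is re-drawn with probability beta.
        for i in range(n):
            if rng.random() < beta:
                candidate[i] = rng.randrange(k)
        candidate_cost = cost(points, candidate, k)
        # Greedy acceptance: keep the candidate only if it is no worse.
        if candidate_cost <= best:
            labels, best = candidate, candidate_cost
    return labels, best
```

In this sketch, setting beta to 0 reduces the loop to plain hill climbing over single-document moves, while larger values make each candidate more random and therefore more exploratory.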

RESULTS: Experiments were conducted on eight standard benchmark text datasets with different characteristics, taken from the Laboratory of Computational Intelligence (LABIC). The results showed that the proposed β-hill climbing achieved better results than the original hill climbing technique on the text clustering problem.

CONCLUSION: Adding the β operator to hill climbing improves text clustering performance.
