Affiliations 

  • 1 Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia; Faculty of Computer Sciences and Mathematics, University of Mosul, Mosul, Iraq
  • 2 Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia
  • 3 MIS Department, CBA, Salman Bin Abdulaziz University, Alkharj, Saudi Arabia
  • 4 College of Computer and Information Sciences (CCIS), Prince Sultan University, Riyadh, Saudi Arabia
  • 5 Computer Science Department, College of Computer & Information Sciences, King Saud University, Riyadh, Saudi Arabia
ScientificWorldJournal, 2014;2014:612787.
PMID: 25309952 DOI: 10.1155/2014/612787

Abstract

This paper presents a novel feature-mining approach for documents that cannot be mined via optical character recognition (OCR). By identifying the close relationship between the text and graphical components, the proposed technique extracts the Start, End, and Exact values for each bar. Furthermore, word 2-gram and Euclidean distance methods are used to accurately detect plagiarism in bar charts.
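
To illustrate the two comparison measures named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that chart text (titles, axis labels) is compared by the overlap of its word 2-grams and that bar values are compared by Euclidean distance between equal-length numeric vectors. The function names, the Jaccard overlap ratio, and the sample values are all hypothetical choices made for this example; the abstract does not specify how the two measures are combined or thresholded.

```python
import math


def word_bigrams(text):
    """Return the set of word 2-grams (adjacent word pairs) in a string."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def bigram_overlap(text_a, text_b):
    """Jaccard overlap of word 2-grams (an assumed similarity measure)."""
    grams_a, grams_b = word_bigrams(text_a), word_bigrams(text_b)
    if not grams_a and not grams_b:
        return 1.0  # two texts with no 2-grams are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def euclidean_distance(values_a, values_b):
    """Euclidean distance between two equal-length vectors of bar values."""
    if len(values_a) != len(values_b):
        raise ValueError("bar value vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(values_a, values_b)))


# Hypothetical example: text labels and bar values from two charts.
label_similarity = bigram_overlap("annual sales by region",
                                  "annual sales by country")
value_distance = euclidean_distance([10.0, 20.0, 30.0], [10.0, 20.0, 30.0])
```

In this sketch, a high 2-gram overlap together with a small distance between bar values would suggest that one chart reuses the other's content; any decision threshold would have to be chosen empirically.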
