Affiliations 

  • 1 Soft Computing and Data Mining Center, Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia (UTHM), 83000, Batu Pahat, Malaysia
  • 2 Soft Computing and Data Mining Center, Faculty of Computer Sciences and Information Technology, Universiti Tun Hussein Onn Malaysia (UTHM), 83000, Batu Pahat, Malaysia. shahreen@uthm.edu.my
  • 3 Computer and Information Sciences Department (CISD), Universiti Teknologi PETRONAS (UTP), 32610, Seri Iskandar, Malaysia
  • 4 Department of Electrical Engineering, Universitas Ahmad Dahlan (UAD), 55166, Yogyakarta, Indonesia
  • 5 Faculty of Electrical Engineering, Universiti Teknologi Malaysia (UTM), 81310, Johor Bahru, Malaysia
  • 6 Faculty of Computing, Universiti Teknologi Malaysia (UTM), 81310, Johor Bahru, Malaysia
BMC Bioinformatics, 2024 Jan 12;25(1):23.
PMID: 38216898 DOI: 10.1186/s12859-024-05632-w

Abstract

BACKGROUND: With the exponential growth of high-throughput technologies, multiple pathway analysis methods have been proposed to estimate pathway activities from gene expression profiles. These pathway activity inference methods can be divided into two main categories: non-Topology-Based (non-TB) and Pathway Topology-Based (PTB) methods. Although some review and survey articles discussed the topic from different aspects, there is a lack of systematic assessment and comparisons on the robustness of these approaches.

RESULTS: Thus, this study presents comprehensive robustness evaluations of seven widely used pathway activity inference methods using six cancer datasets based on two assessments. The first assessment seeks to investigate the robustness of pathway activity in pathway activity inference methods, while the second assessment aims to assess the robustness of risk-active pathways and genes predicted by these methods. The mean reproducibility power and total number of identified informative pathways and genes were evaluated. Based on the first assessment, the mean reproducibility power of pathway activity inference methods generally decreased as the number of pathway selections increased. Entropy-based Directed Random Walk (e-DRW) distinctly outperformed other methods in exhibiting the greatest reproducibility power across all cancer datasets. On the other hand, the second assessment shows that no methods provide satisfactory results across datasets.

CONCLUSION: However, PTB methods generally appear to perform better in producing greater reproducibility power and identifying potential cancer markers compared to non-TB methods.

* Title and MeSH Headings from MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine.