Affiliations 

  • 1 Department of Mathematics, Federal University Gusau, Gusau, Nigeria. ojibidoja@fugusau.edu.ng
  • 2 School of Mathematical Sciences, Universiti Sains Malaysia (USM), 11800, Penang, Malaysia
  • 3 School of Mathematical Sciences, Universiti Sains Malaysia (USM), 11800, Penang, Malaysia. majidkhanmajaharali@usm.my
Sci Rep, 2024 Jul 30;14(1):17599.
PMID: 39080303 DOI: 10.1038/s41598-024-60612-7

Abstract

Linear regression is central to data modelling across the sciences. With the growing abundance of high-dimensional data, however, many data sets contain more explanatory variables than observations, and in such circumstances traditional approaches fail. This paper proposes a modified sparse regression model that addresses the problem of heterogeneity, using seaweed big data as a use case. Modified heterogeneity models for ridge, LASSO and elastic net regression were used to model the data, together with the robust estimators M Bi-Square, M Hampel, M Huber, MM and S. The results show that the hybrid model of sparse regression with robust regression for the before-, after-, and modified-heterogeneity cases, using the 45 highest-ranking variables and a 2-sigma limit, can efficiently and effectively reduce the outliers. They also confirm that the hybrid model of the modified sparse LASSO with the M Bi-Square estimator for the 45 highest-ranking parameters performed better than the other existing methods.
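The general idea of pairing a sparse fit with a robust refit and a 2-sigma outlier limit can be illustrated with a minimal sketch. This is not the authors' modified heterogeneity model and does not use the seaweed data: it assumes a plain LASSO solved by coordinate descent, a Tukey bisquare M-estimator fitted by iteratively reweighted least squares with the conventional tuning constant 4.685, and synthetic data. All function names (`lasso_cd`, `bisquare_irls`, `two_sigma_outliers`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z toward zero by t (the LASSO proximal step)."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Plain LASSO by coordinate descent.

    Minimises (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.
    Works when there are more columns than rows (p > n).
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()  # equals y - X @ beta throughout
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def bisquare_irls(X, y, c=4.685, n_iter=50):
    """Tukey bisquare M-estimate via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust scale from the median absolute deviation.
        scale = np.median(np.abs(r - np.median(r))) / 0.6745
        if scale < 1e-12:  # residuals are essentially zero; stop
            break
        u = r / (c * scale)
        w = np.where(np.abs(u) < 1.0, (1.0 - u ** 2) ** 2, 0.0)
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta


def two_sigma_outliers(residuals):
    """Flag residuals more than 2 standard deviations from their mean."""
    centered = residuals - residuals.mean()
    return np.abs(centered) > 2.0 * residuals.std()


if __name__ == "__main__":
    # Synthetic p > n data with three active variables and a few outliers.
    rng = np.random.default_rng(0)
    n, p = 40, 100
    X = rng.standard_normal((n, p))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.5 * X[:, 2]
    y = y + 0.1 * rng.standard_normal(n)
    y[:3] += 15.0  # contaminate three observations

    beta = lasso_cd(X, y, lam=0.3)
    support = np.flatnonzero(beta)  # variables kept by the sparse step
    print("selected variables:", support)

    if support.size and support.size < n:
        robust = bisquare_irls(X[:, support], y)  # robust refit on the support
        flags = two_sigma_outliers(y - X[:, support] @ robust)
        print("observations flagged at 2 sigma:", np.flatnonzero(flags))
```

The two-stage layout (select variables first, then refit robustly on the selected columns) is one simple way to combine the ingredients named in the abstract; the paper's own hybrid may combine them differently.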
