This research employs the gradient descent learning (FIR.DM) approach as the learning process in a nonlinear spectral model based on the maximal overlap discrete wavelet transform (MODWT) to improve volatility prediction of daily stock market prices, using data from Saudi Arabia's stock exchange (Tadawul). The MODWT component comprises five wavelet functions, which are combined with fuzzy inference rules. The inputs are the oil price (Loil) and the repo rate (Repo), selected according to multiple regression, correlation analysis, and the Engle–Granger causality test (Engle and Granger, 1987). The logarithm of the Tadawul stock market price (LSCS) is the output variable. The correlation matrix reveals no collinearity between the input variables, and the causality test shows that both inputs significantly influence the output variable. According to the multiple regression, Loil has a significant negative effect on LSCS, whereas Repo has a significant positive effect. On the 80% training dataset, the MODWT-LA8 function (ARIMA(1,1,0) with drift) for the LSCS variable outperforms the other wavelet transform functions, with ME (0.000005), MAE (0.003214), and MAPE (0.064497). In the novel hybrid model MODWT-FIR.DM, the approximation coefficients of each wavelet function for LSCS are combined with the input variables (Loil and Repo). We evaluate the proposed model (MODWT-LA8-FIR.DM) on the remaining 20% of the dataset using several statistical measures (ME, RMSE, MAE, MPE) and compare it with two established baselines: the original FIR.DM and the MODWT-FIR.DM variants built on the other wavelet functions. The results show that MODWT-LA8-FIR.DM outperforms the traditional models, with lower ME (3.167586), RMSE (3.167638), MAE (3.167586), and MPE (80.860849). The proposed hybrid model may thus serve as a viable stock market forecasting approach.
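The abstract above relies on two building blocks that are easy to sketch: an undecimated wavelet decomposition of the log price series and the forecast-accuracy metrics used to compare the candidate models. The snippet below is a minimal illustration, not the authors' code: PyWavelets exposes the stationary wavelet transform (pywt.swt), which is commonly used as a stand-in for MODWT, and 'sym4' is the 8-tap least-asymmetric (LA8) filter; the series names, decomposition level, and naive baseline forecast are assumptions chosen for brevity.

```python
# Illustrative sketch: MODWT-like decomposition + forecast-accuracy metrics.
import numpy as np
import pywt

def modwt_like_decompose(series, wavelet="sym4", level=5):
    """Stationary (undecimated) wavelet transform of a 1-D series.
    Returns a list of (approximation, detail) coefficient pairs per level."""
    n = len(series)
    pad = (-n) % (2 ** level)                 # swt needs length divisible by 2**level
    padded = np.pad(np.asarray(series, float), (0, pad), mode="reflect")
    return pywt.swt(padded, wavelet, level=level)

def forecast_metrics(actual, predicted):
    """ME, RMSE, MAE, MPE and MAPE, the measures quoted in the abstract."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    err = actual - predicted
    return {
        "ME": err.mean(),
        "RMSE": np.sqrt((err ** 2).mean()),
        "MAE": np.abs(err).mean(),
        "MPE": 100 * (err / actual).mean(),
        "MAPE": 100 * np.abs(err / actual).mean(),
    }

# Example: decompose a synthetic log-price series and score a naive forecast.
rng = np.random.default_rng(0)
lscs = np.cumsum(rng.normal(0, 0.01, 512)) + 8.0   # stand-in for log Tadawul price
coeffs = modwt_like_decompose(lscs)                 # [(cA_5, cD_5), ..., (cA_1, cD_1)]
print(forecast_metrics(lscs[1:], lscs[:-1]))        # naive one-step-ahead baseline
```

In the hybrid model described above, the approximation coefficients from such a decomposition (rather than the raw series) would be passed, together with Loil and Repo, to the FIR.DM learning stage; that stage is omitted here since its exact formulation is specific to the paper.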
Specialized data preparation techniques, such as data cleaning, outlier detection, missing value imputation, and feature selection (FS), are required to get the most out of data and, consequently, to obtain optimal performance from predictive models in classification tasks. FS is a vital and indispensable technique that enables a model to run faster, eliminate noisy data, remove redundancy, reduce overfitting, improve precision, and generalize better to test data. While conventional FS techniques have been leveraged for classification tasks over the past few decades, they fail to optimally reduce the high dimensionality of text feature spaces, thus breeding inefficient predictive models. Emerging approaches such as metaheuristic and hyper-heuristic optimization methods provide a new paradigm for FS owing to their ability to improve classification accuracy, reduce computational and storage demands, and solve complex optimization problems in less time. However, little is known about best practices for the case-by-case use of these emerging FS methods. The literature remains cluttered with both clear and ambiguous findings on which methods are effective, and methods applied incorrectly degrade precision, real-world feasibility, and the predictive model's overall performance. This paper reviews the current state of FS with respect to metaheuristic and hyper-heuristic methods. Through a systematic literature review of over 200 articles, we set out the most recent findings and trends to guide analysts, practitioners, and researchers in the field of data analytics who seek clarity in understanding and implementing effective FS optimization methods for improved text classification.
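To make the metaheuristic FS paradigm concrete, the sketch below shows a wrapper-style selector: a simple (1+1) hill climber over binary feature masks, with cross-validated classification accuracy as the fitness function. It is an illustrative example rather than any specific method from the reviewed literature; the synthetic data stands in for a high-dimensional TF-IDF text matrix, and the classifier, mutation rate, and iteration budget are assumptions chosen for brevity.

```python
# Illustrative sketch: wrapper feature selection via a (1+1) hill-climbing metaheuristic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X, y = make_classification(n_samples=300, n_features=200, n_informative=15,
                           n_redundant=30, random_state=42)

def fitness(mask):
    """Mean 3-fold CV accuracy of the classifier on the selected features."""
    if not mask.any():
        return 0.0
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

# (1+1) hill climbing: flip a few mask bits, keep the mutant if it is no worse.
mask = rng.random(X.shape[1]) < 0.5           # random initial feature subset
best = fitness(mask)
for _ in range(50):                           # iteration budget
    mutant = mask.copy()
    flips = rng.integers(0, X.shape[1], size=5)
    mutant[flips] = ~mutant[flips]            # mutate the feature mask
    score = fitness(mutant)
    if score >= best:                         # accept equal-or-better subsets
        mask, best = mutant, score

print(f"selected {mask.sum()} / {X.shape[1]} features, CV accuracy = {best:.3f}")
```

More sophisticated metaheuristics (genetic algorithms, particle swarm, or hyper-heuristics that select among such operators) follow the same wrapper structure, differing mainly in how candidate feature masks are generated and accepted.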