MyMedR

Displaying publications 1 - 20 of 266 in total

Crude oil price forecasting based on hybridizing wavelet multiple linear regression model, particle swarm optimization techniques, and principal component analysis

Shabri A, Samsudin R

ScientificWorldJournal, 2014;2014:854520.
PMID: 24895666 DOI: 10.1155/2014/854520

Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first used to decompose an original time series into several subseries at different scales. Then, principal component analysis (PCA) is used to process the subseries data in MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to find the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily West Texas Intermediate (WTI) crude oil market has been used as the case study. The time series prediction performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series.

Matched MeSH terms: Principal Component Analysis/methods*
New discrimination procedure of location model for handling large categorical variables

Hashibah Hamid, Long MM, Sharipah Soaad Syed Yahaya

Sains Malaysiana, 2017;46:1001-1010.

The location model proposed in the past is a predictive discriminant rule that can classify new observations into one of two predefined groups based on mixtures of continuous and categorical variables. The ability of the location model to discriminate new observations correctly is highly dependent on the number of multinomial cells created by the number of categorical variables. This study conducts a preliminary investigation to show that the location model using maximum likelihood estimation has a high misclassification rate, up to 45% on average, when dealing with more than six categorical variables for all 36 data sets tested. The model produced highly incorrect predictions, performing badly for large numbers of categorical variables even with a large sample size. To alleviate the high rate of misclassification, a new strategy is embedded in the discriminant rule by introducing nonlinear principal component analysis (NPCA) into the classical location model (cLM), mainly to handle the large number of categorical variables. This new strategy is investigated on simulated and real datasets through the estimation of the misclassification rate using the leave-one-out method. The results from numerical investigations demonstrate the feasibility of the proposed model, as the misclassification rate is dramatically decreased compared to the cLM for all 18 different data settings. A practical application using a real dataset demonstrates a significant improvement and obtains results comparable to the best methods that are compared. The overall findings reveal that the proposed model extended the applicability range of the location model, which was previously limited to only six categorical variables to achieve acceptable performance. This study proved that the proposed model with the new discrimination procedure can be used as an alternative for mixed-variable classification problems, primarily when facing large numbers of categorical variables.

Matched MeSH terms: Principal Component Analysis
Evaluation of dissolved heavy metals in water of the Sungai Semenyih (peninsular Malaysia) using environmetric methods

Fawaz Al-badaii, Azhar Abdul Halim, Mohammad Shuhaimi-othman

Sains Malaysiana, 2016;45:841-852.

This study was carried out to determine the concentrations of dissolved heavy metals in the Sungai Semenyih and to use environmetric methods to evaluate the influence of different pollution sources on heavy metal concentrations. Cluster analysis (CA) classified the 8 sampling stations into two clusters based on the similarity of their characteristics: cluster 1 included stations 1, 2, 3 and 4 (low pollution area), whereas cluster 2 comprised stations 5, 6, 7 and 8 (high pollution area). Principal component analysis (PCA) of the two datasets yielded two factors for the low pollution area and three factors for the high pollution area at eigenvalues >1, representing 92.544% and 100% of the total variance in the respective heavy metal datasets, and allowed the selected heavy metals to be grouped according to anthropogenic and lithologic sources of contamination.
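
Illustrative note: the "eigenvalues >1" retention rule mentioned in this abstract can be sketched as follows. This is a minimal sketch on synthetic data, not the authors' analysis; the function name `kaiser_components` and the generated data are hypothetical, and the sketch assumes PCA on the correlation matrix.

```python
import numpy as np

def kaiser_components(X):
    """Eigen-decompose the correlation matrix of X (samples x variables) and
    count the components whose eigenvalue exceeds 1."""
    R = np.corrcoef(X, rowvar=False)          # correlation matrix of the variables
    eigvals = np.linalg.eigvalsh(R)[::-1]     # eigenvalues, largest first
    keep = int(np.sum(eigvals > 1.0))         # components retained by the rule
    explained = eigvals[:keep].sum() / eigvals.sum()  # variance share retained
    return eigvals, keep, explained

# Hypothetical data: 40 samples of 6 variables driven by 2 latent sources.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 2))
loadings = rng.normal(size=(2, 6))
X = latent @ loadings + 0.1 * rng.normal(size=(40, 6))

eigvals, keep, explained = kaiser_components(X)
print(keep, round(float(explained), 3))
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above 1 marks a component that carries more variance than any single standardized variable.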

Matched MeSH terms: Principal Component Analysis
A simplified clustering method for novice narcotic chemists

Chan KW, Tan GH, Wong RC

Sci Justice, 2012 Sep;52(3):136-41.
PMID: 22841136 DOI: 10.1016/j.scijus.2012.04.006

Statistical classification remains the most useful statistical tool for forensic chemists to assess the relationships between samples. Many clustering techniques such as principal component analysis and hierarchical cluster analysis have been employed to analyze chemical data for pattern recognition. Due to the feeble foundation of this statistics knowledge among novice drug chemists, a tetrahedron method was designed to simulate how advanced chemometrics operates. In this paper, the development of the graphical tetrahedron and computational matrices derived from the possible tetrahedrons are discussed. The tetrahedron method was applied to four selected parameters obtained from nine illicit heroin samples. Pattern analysis and mathematical computation of the differences in areas for assessing the dissimilarity between the nine tetrahedrons were found to be user-convenient and straightforward for novice cluster analysts.

Matched MeSH terms: Principal Component Analysis
Water quality modelling using principal component analysis and artificial neural network

Ibrahim A, Ismail A, Juahir H, Iliyasu AB, Wailare BT, Mukhtar M, et al.

Mar Pollut Bull, 2023 Feb;187:114493.
PMID: 36566515 DOI: 10.1016/j.marpolbul.2022.114493

The study investigates the latent pollution sources and most significant parameters that cause spatial variation and develops the best input for water quality modelling using principal component analysis (PCA) and artificial neural network (ANN). The dataset of 22 water quality parameters was obtained from the Department of Environment Malaysia (DOE). The PCA generated six significant principal component scores (PCs), which explained 65.40% of the total variance. Parameters for water quality variation are mainly related to mineral components, anthropogenic activities, and natural processes. In the ANN, three input combination models (ANN A, B, and C) were developed to identify the best model that can predict the water quality index (WQI) with very high precision. The ANN A model appears to have the best prediction capacity, with a coefficient of determination (R²) = 0.9999 and root mean square error (RMSE) = 0.0537. These results proved that the PCA and ANN methods can be applied as tools for decision-making and problem-solving for better management of river quality.

Matched MeSH terms: Principal Component Analysis
A morphometric approach to morphology analysis of palatal rugae in sibling groups

Tey SN, Syed Mohamed AMF, Marizan Nor M

J Forensic Sci, 2024 Jan;69(1):189-198.
PMID: 37706423 DOI: 10.1111/1556-4029.15380

Recent advances in imaging technologies, such as intra-oral surface scanning, have rapidly generated large datasets of high-resolution three-dimensional (3D) sample reconstructions. These datasets contain a wealth of phenotypic information that can provide an understanding of morphological variation and evolution. The geometric morphometric method (GMM) with landmarks and the development of sliding and surface semilandmark techniques has greatly enhanced the quantification of shape. This study aimed to determine whether there are significant differences in 3D palatal rugae shape between siblings. Digital casts representing 25 pairs of full siblings from each group, male-male (MM), female-female (FF), and female-male (FM), were digitized and transferred to a GM system. The palatal rugae were determined, quantified, and visualized using GMM computational tools with MorphoJ software (University of Manchester). Principal component analysis (PCA) and canonical variates analysis (CVA) were employed to analyze palatal rugae shape variability and distinguish between sibling groups based on shape. Additionally, regression analysis examined the potential impact of shape on palatal rugae. The study revealed that the first nine principal components accounted for 71.3% of the palatal rugae shape variation. In addition, the size of the palatal rugae has a negligible impact on its shape. Whilst palatal rugae are known for their individuality, it is noteworthy that three palatal rugae (right first, right second, and left third) can differentiate sibling groups, which may be attributed to genetics. Therefore, it is suggested that palatal rugae morphology can serve as forensic identification for siblings.

Matched MeSH terms: Principal Component Analysis
Forecasting East Asian Indices Futures via a Novel Hybrid of Wavelet-PCA Denoising and Artificial Neural Network Models

Chan Phooi M'ng J, Mehralizadeh M

PLoS One, 2016;11(6):e0156338.
PMID: 27248692 DOI: 10.1371/journal.pone.0156338

The motivation behind this research is to innovatively combine new methods like wavelet, principal component analysis (PCA), and artificial neural network (ANN) approaches to analyze trade in today's increasingly difficult and volatile financial futures markets. The main focus of this study is to facilitate forecasting by using an enhanced denoising process on market data, taken as a multivariate signal, in order to deduct the same noise from the open-high-low-close signal of a market. This research offers evidence on the predictive ability and the profitability of abnormal returns of a new hybrid forecasting model using Wavelet-PCA denoising and ANN (named WPCA-NN) on futures contracts of Hong Kong's Hang Seng futures, Japan's NIKKEI 225 futures, Singapore's MSCI futures, South Korea's KOSPI 200 futures, and Taiwan's TAIEX futures from 2005 to 2014. Using a host of technical analysis indicators consisting of RSI, MACD, MACD Signal, Stochastic Fast %K, Stochastic Slow %K, Stochastic %D, and Ultimate Oscillator, empirical results show that the annual mean returns of WPCA-NN are more than the threshold buy-and-hold for the validation, test, and evaluation periods; this is inconsistent with the traditional random walk hypothesis, which insists that mechanical rules cannot outperform the threshold buy-and-hold. The findings, however, are consistent with literature that advocates technical analysis.

Matched MeSH terms: Principal Component Analysis
Neural network and principal component regression in non-destructive soluble solids content assessment: a comparison

Chia KS, Abdul Rahim H, Abdul Rahim R

J Zhejiang Univ Sci B, 2012 Feb;13(2):145-51.
PMID: 22302428 DOI: 10.1631/jzus.B11c0150

Visible and near infrared spectroscopy is a non-destructive, green, and rapid technology that can be utilized to estimate the components of interest without conditioning it, as compared with classical analytical methods. The objective of this paper is to compare the performance of artificial neural network (ANN) (a nonlinear model) and principal component regression (PCR) (a linear model) based on visible and shortwave near infrared (VIS-SWNIR) (400-1000 nm) spectra in the non-destructive soluble solids content measurement of an apple. First, we used multiplicative scattering correction to pre-process the spectral data. Second, PCR was applied to estimate the optimal number of input variables. Third, the input variables with an optimal amount were used as the inputs of both multiple linear regression and ANN models. The initial weights and the number of hidden neurons were adjusted to optimize the performance of ANN. Findings suggest that the predictive performance of ANN with two hidden neurons outperforms that of PCR.
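
Illustrative note: principal component regression, the linear model compared in this abstract, can be sketched as PCA scores followed by least squares. This is a minimal sketch under stated assumptions, not the authors' code: the names `pcr_fit`, `pcr_predict` and `rmse` are hypothetical, the data are synthetic rather than VIS-SWNIR spectra, and the scattering correction and ANN stages are omitted.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Fit PCR: project centered X onto its leading principal directions,
    then regress centered y on the resulting scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    scores = Xc @ V
    coef, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    return x_mean, y_mean, V, coef

def pcr_predict(model, X):
    x_mean, y_mean, V, coef = model
    return (X - x_mean) @ V @ coef + y_mean

def rmse(model, X, y):
    return float(np.sqrt(np.mean((pcr_predict(model, X) - y) ** 2)))

# Hypothetical data: 50 samples, 8 predictors, nearly linear response.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
y = X @ rng.normal(size=8) + 0.01 * rng.normal(size=50)

for k in (2, 8):
    print(k, rmse(pcr_fit(X, y, k), X, y))
```

In practice the number of components would be chosen on held-out data rather than on the training error printed here, which can only decrease as components are added.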

Matched MeSH terms: Principal Component Analysis*
Performance of Combined Support Vector Machine and Principal Component Analysis in recognizing infant cry with asphyxia

Sahak R, Mansor W, Lee YK, Yassin AM, Zabidi A

Annu Int Conf IEEE Eng Med Biol Soc, 2010;2010:6292-5.
PMID: 21097359 DOI: 10.1109/IEMBS.2010.5628084

Combined Support Vector Machine (SVM) and Principal Component Analysis (PCA) was used to recognize the infant cries with asphyxia. SVM classifier based on features selected by the PCA was trained to differentiate between pathological and healthy cries. The PCA was applied to reduce dimensionality of the vectors that serve as inputs to the SVM. The performance of the SVM utilizing linear and RBF kernel was examined. Experimental results showed that SVM with RBF kernel yields good performance. The classification accuracy in classifying infant cry with asphyxia using the SVM-PCA is 95.86%.

Matched MeSH terms: Principal Component Analysis/methods*
Improved classification of Orthosiphon stamineus by data fusion of electronic nose and tongue sensors

Zakaria A, Shakaff AY, Adom AH, Ahmad MN, Masnan MJ, Aziz AH, et al.

Sensors (Basel), 2010;10(10):8782-96.
PMID: 22163381 DOI: 10.3390/s101008782

An improved classification of Orthosiphon stamineus using a data fusion technique is presented. Five different commercial sources along with freshly prepared samples were discriminated using an electronic nose (e-nose) and an electronic tongue (e-tongue). Samples from the different commercial brands were evaluated by the e-tongue and then followed by the e-nose. Applying Principal Component Analysis (PCA) separately on the respective e-tongue and e-nose data, only five distinct groups were projected. However, by employing a low level data fusion technique, six distinct groupings were achieved. Hence, this technique can enhance the ability of PCA to analyze the complex samples of Orthosiphon stamineus. Linear Discriminant Analysis (LDA) was then used to further validate and classify the samples. It was found that the LDA performance was also improved when the responses from the e-nose and e-tongue were fused together.
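
Illustrative note: low-level data fusion, as described in this abstract, is commonly done by scaling each sensor block and concatenating the columns before PCA. The sketch below assumes that reading; the function names, sample count and sensor counts are hypothetical, and the data are random rather than e-nose or e-tongue measurements.

```python
import numpy as np

def autoscale(X):
    """Center each column and scale it to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def low_level_fusion(*blocks):
    """Concatenate autoscaled sensor blocks column-wise (same samples in each)."""
    return np.hstack([autoscale(b) for b in blocks])

def pca_scores(X, n_components=2):
    """Principal component scores of X via SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

# Hypothetical measurements: 18 samples, 4 e-nose and 7 e-tongue channels.
rng = np.random.default_rng(1)
enose = rng.normal(size=(18, 4))
etongue = rng.normal(size=(18, 7))

fused = low_level_fusion(enose, etongue)
scores = pca_scores(fused)
print(fused.shape, scores.shape)
```

Scaling each block first keeps one instrument from dominating the fused principal components simply because its raw readings are larger.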

Matched MeSH terms: Principal Component Analysis/methods
A hybrid color space for skin detection using genetic algorithm heuristic search and principal component analysis technique

Maktabdar Oghaz M, Maarof MA, Zainal A, Rohani MF, Yaghoubyan SH

PLoS One, 2015;10(8):e0134828.
PMID: 26267377 DOI: 10.1371/journal.pone.0134828

Color is one of the most prominent features of an image and used in many skin and face detection applications. Color space transformation is widely used by researchers to improve face and skin detection performance. Despite the substantial research efforts in this area, choosing a proper color space in terms of skin and face classification performance which can address issues like illumination variations, various camera characteristics and diversity in skin color tones has remained an open issue. This research proposes a new three-dimensional hybrid color space termed SKN by employing the Genetic Algorithm heuristic and Principal Component Analysis to find the optimal representation of human skin color in over seventeen existing color spaces. Genetic Algorithm heuristic is used to find the optimal color component combination setup in terms of skin detection accuracy while the Principal Component Analysis projects the optimal Genetic Algorithm solution to a less complex dimension. Pixel wise skin detection was used to evaluate the performance of the proposed color space. We have employed four classifiers including Random Forest, Naïve Bayes, Support Vector Machine and Multilayer Perceptron in order to generate the human skin color predictive model. The proposed color space was compared to some existing color spaces and shows superior results in terms of pixel-wise skin detection accuracy. Experimental results show that by using Random Forest classifier, the proposed SKN color space obtained an average F-score and True Positive Rate of 0.953 and False Positive Rate of 0.0482 which outperformed the existing color spaces in terms of pixel wise skin detection accuracy. The results also indicate that among the classifiers used in this study, Random Forest is the most suitable classifier for pixel wise skin detection applications.

Matched MeSH terms: Principal Component Analysis*
Effect of data pre-treatment procedures on principal component analysis: a case study for mangrove surface sediment datasets

Praveena SM, Kwan OW, Aris AZ

Environ Monit Assess, 2012 Nov;184(11):6855-68.
PMID: 22146822 DOI: 10.1007/s10661-011-2463-2

Principal component analysis (PCA) is capable of handling large sets of data. However, the lack of a consistent method for data pre-treatment, despite its importance, limits PCA applications. This study examined pre-treatment methods (log (x + 1) transformation, outlier removal, and granulometric and geochemical normalization) on a dataset of mangrove surface sediment from Mengkabong Lagoon, Sabah, at high and low tides. The study revealed that geochemical normalization using Al with outlier removal resulted in a better classification of the mangrove surface sediment than outlier removal, granulometric normalization using clay, or log (x + 1) transformation. PCA output using geochemical normalization with outlier removal demonstrated associations between environmental variables and tides for the mangrove surface sediment of Mengkabong Lagoon, Sabah. The PCA outputs at high and low tides also helped to better interpret information about the sediment and its controlling factors in the intertidal zone. The study showed data pre-treatment to be a useful procedure for standardizing the datasets and reducing the influence of outliers.
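
Illustrative note: the pre-treatments named in this abstract can be sketched as simple array operations applied before PCA. This is a hedged sketch, not the authors' procedure: the function names are hypothetical, the z-score cutoff for outliers is an assumption (the abstract does not state which outlier rule was used), and the data are synthetic.

```python
import numpy as np

def log_transform(X):
    """log(x + 1) transformation."""
    return np.log1p(X)

def normalize_to_reference(X, ref_col):
    """Divide every other column by a reference column (for example Al) and
    drop the reference column itself."""
    ref = X[:, [ref_col]]
    others = np.delete(X, ref_col, axis=1)
    return others / ref

def remove_outliers(X, z_max=3.0):
    """Drop rows with any |z-score| >= z_max (assumed rule, for illustration)."""
    z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
    return X[(z < z_max).all(axis=1)]

# Hypothetical concentrations: 30 samples, 3 elements, column 2 as reference.
rng = np.random.default_rng(2)
X = rng.uniform(1.0, 5.0, size=(30, 3))
X[0, 0] = 500.0                               # one gross outlier

cleaned = remove_outliers(X)
normalized = normalize_to_reference(cleaned, ref_col=2)
print(cleaned.shape, normalized.shape)
```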

Matched MeSH terms: Principal Component Analysis/methods*
Dataset of Fourier transform-infrared coupled with chemometric analysis used to distinguish accessions of Garcinia mangostana L. in Peninsular Malaysia

Samsir SA, Bunawan H, Yen CC, Noor NM

Data Brief, 2016 Sep;8:1-5.
PMID: 27257614 DOI: 10.1016/j.dib.2016.04.062

In this dataset, we distinguish 15 accessions of Garcinia mangostana from Peninsular Malaysia using Fourier transform-infrared spectroscopy coupled with chemometric analysis. We found that the position and intensity of characteristic peaks at 3600-3100 cm⁻¹ in IR spectra allowed discrimination of G. mangostana from different locations. Further principal component analysis (PCA) of all the accessions suggests that two main clusters were formed: samples from Johor, Melaka, and Negeri Sembilan (South) were clustered together in one group while samples from Perak, Kedah, Penang, Selangor, Kelantan, and Terengganu (North and East Coast) were in another clustered group.

Matched MeSH terms: Principal Component Analysis
A study of community structure and beta diversity of epiphyllous liverwort assemblages in Sabah, Malaysian Borneo

Pócs T, Lee GE, Podani J, Pesiu E, Havasi J, Tang HY, et al.

PhytoKeys, 2020;153:63-83.
PMID: 32765181 DOI: 10.3897/phytokeys.153.53637

We evaluated the species richness and beta diversity of epiphyllous assemblages from three selected localities in Sabah, i.e. Mt. Silam in Sapagaya Forest Reserve, and Ulu Senagang and Mt. Alab in Crocker Range Park. A total of 98 species were found and a phytosociological survey was carried out based on the three study areas. A detailed statistical analysis including standard correlation and regression analyses, ordination of species and leaves using centered principal component analysis, and the SDR simplex method to evaluate the beta diversity, was conducted. Beta diversity is very high in the epiphyllous liverwort assemblages in Sabah, with species replacement as the major component of pattern formation and less pronounced richness difference. The community analysis of the epiphyllous communities in Sabah makes possible their detailed description and comparison with similar communities of other continents.

Matched MeSH terms: Principal Component Analysis
The use of principal component and cluster analysis to differentiate banana peel flours based on their starch and dietary fibre components

Ramli S, Ismail N, Alkarkhi AF, Easa AM

Trop Life Sci Res, 2010 Aug;21(1):91-100.
PMID: 24575193 MyJurnal

Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits were assessed for their total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data obtained two statistically significant clusters (green and ripe bananas) to indicate difference in behaviours according to the stages of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food.

Matched MeSH terms: Principal Component Analysis
Accuracy Improvement for Predicting Parkinson's Disease Progression

Nilashi M, Ibrahim O, Ahani A

Sci Rep, 2016 Sep 30;6:34181.
PMID: 27686748 DOI: 10.1038/srep34181

Parkinson's disease (PD) is a member of a larger group of neuromotor diseases marked by the progressive death of dopamine-producing cells in the brain. Computational tools for Parkinson's disease that use medical data are very desirable for alleviating symptoms and can help the many people who want to discover their risk of disease at an early stage. This paper proposes a new hybrid intelligent system for the prediction of PD progression using noise removal, clustering and prediction methods. Principal Component Analysis (PCA) and Expectation Maximization (EM) are respectively employed to address the multi-collinearity problems in the experimental datasets and to cluster the data. We then apply Adaptive Neuro-Fuzzy Inference System (ANFIS) and Support Vector Regression (SVR) for prediction of PD progression. Experimental results on public Parkinson's datasets show that the proposed method remarkably improves the accuracy of prediction of PD progression. The hybrid intelligent system can assist medical practitioners in healthcare practice in the early detection of Parkinson's disease.

Matched MeSH terms: Principal Component Analysis
Classification of car paint primers using Pyrolysis-Gas Chromatography-Mass Spectrometry (Py-GC-MS) and chemometric techniques

Raja Zubaidah Raja Sabaradin, Norashikin Saim, Rozita Osman, Hafizan Juahir

Pertanika Journal of Science & Technology, 2017;25(107):53-66.
MyJurnal

Pyrolysis-gas chromatography-mass spectrometry (Py-GC-MS) has been recognised as an effective technique to analyse car paint. This study was conducted to assess the combination of Py-GC-MS and chemometric techniques to classify car paint primer, the inner layer of car paint system. Fifty car paint primer samples from various manufacturers were analysed using Py-GC-MS, and data set of identified pyrolysis products was subjected to principal component analysis (PCA) and discriminant analysis (DA). The PCA rendered 16 principal components with 86.33% of the total variance. The DA was useful to classify the car paint primer samples according to their types (1k and 2k primer) with 100% correct classification in the test set for all three modes (standard, stepwise forward and stepwise backward). Three compounds, indolizine, 1,3-benzenedicarbonitrile and p-terphenyl, were the most significant compounds in discriminating the car paint primer samples.

Matched MeSH terms: Principal Component Analysis
New smoothed location models integrated with PCA and two types of MCA for handling large number of mixed continuous and binary variables

Hamid, H., Ngu, P.A.H., Alipiah, F.M.

Pertanika Journal of Science & Technology, 2018;26(1):247-260.
MyJurnal

The issue of classifying objects into groups when measured variables in an experiment are mixed has attracted the attention of statisticians. The Smoothed Location Model (SLM) appears to be a popular classification method to handle data containing both continuous and binary variables simultaneously. However, SLM is infeasible for a large number of binary variables due to the occurrence of numerous empty cells. Therefore, this study aims to construct new SLMs by integrating SLM with two variable extraction techniques, Principal Component Analysis (PCA) and two types of Multiple Correspondence Analysis (MCA) in order to reduce the large number of mixed variables, primarily the binary ones. The performance of the newly constructed models, namely the SLM+PCA+Indicator MCA and SLM+PCA+Burt MCA are examined based on misclassification rate. Results from simulation studies for a sample size of n=60 show that the SLM+PCA+Indicator MCA model provides perfect classification when the sizes of binary variables (b) are 5 and 10. For b=20, the SLM+PCA+Indicator MCA model produces misclassification rates of 0.3833, 0.6667 and 0.3221 for n=60, n=120 and n=180, respectively. Meanwhile, the SLM+PCA+Burt MCA model provides a perfect classification when the sizes of the binary variables are 5, 10, 15 and 20 and yields a small misclassification rate as 0.0167 when b=25. Investigations into real dataset demonstrate that both of the newly constructed models yield low misclassification rates with 0.3066 and 0.2336 respectively, in which the SLM+PCA+Burt MCA model performed the best among all the classification methods compared. The findings reveal that the two new models of SLM integrated with two variable extraction techniques can be good alternative methods for classification purposes in handling mixed variable problems, mainly when dealing with large binary variables.

Matched MeSH terms: Principal Component Analysis
Data on Fourier transform-infrared of Cosmos caudatus Kunth. tissues analyzed with chemometric analysis

Gunasekaran D, Bunawan H, Ismail I, Noor NM

Data Brief, 2018 Aug;19:1423-1427.
PMID: 30229014 DOI: 10.1016/j.dib.2018.06.025

In this dataset, we differentiate four different tissues of Cosmos caudatus Kunth (leaves, flowers, stem and root) obtained from UKM Bangi plot, based on Fourier transform-infrared spectroscopy. Different tissues of C. caudatus demonstrated the position and intensity of characteristic peaks at 4000-450 cm⁻¹. Principal component analysis (PCA) shows that three main groups were formed. The samples from leaves and flowers were found to be clustered together in one group, while the samples from stems and roots were clustered into two separate groups, respectively. This data provides an insight into the fingerprint identification and distribution of metabolites in the different organs of this species.

Matched MeSH terms: Principal Component Analysis
Statistical shape modelling of the first carpometacarpal joint reveals high variation in morphology

Rusli WMR, Kedgley AE

Biomech Model Mechanobiol, 2020 Aug;19(4):1203-1210.
PMID: 31754950 DOI: 10.1007/s10237-019-01257-8

The first carpometacarpal (CMC) joint, located at the base of the thumb and formed by the junction between the first metacarpal and trapezium, is a common site for osteoarthritis of the hand. The shape of both the first metacarpal and trapezium contributes to the intrinsic bony stability of the joint, and variability in the morphology of both these bones can affect the joint's function. The objectives of this study were to quantify the morphological variation in the complete metacarpal and trapezium and determine any correlation between anatomical features of these two components of the first CMC joint. A multi-object statistical shape modelling pipeline, consisting of scaling, hierarchical rigid registration, non-rigid registration and projection pursuit principal component analysis, was implemented. Four anatomical measures were quantified from the shape model, namely the first metacarpal articular tilt and torsion angles and the trapezium length and width. Variations in the first metacarpal articular tilt angle (- 6.3°

Matched MeSH terms: Principal Component Analysis