Methods: Sample sizes were calculated manually using Microsoft Excel software, and sample size tables were tabulated for testing a single coefficient alpha and for comparing two coefficients alpha.
Results: For the test of a single coefficient alpha, setting the Cronbach's alpha coefficient to zero in the null hypothesis yields a small sample size of fewer than 30 to detect a minimum desired effect size of 0.7. However, it may be necessary to set the coefficient larger than zero in the null hypothesis, and this yields a larger sample size. For the comparison of two Cronbach's alpha coefficients, a larger sample size is needed when testing for smaller effect sizes.
Conclusions: In assessing the internal consistency of an instrument, the present study proposes setting the Cronbach's alpha coefficient at 0.5 in the null hypothesis, which requires a larger sample size. For the comparison of two Cronbach's alpha coefficients, justification is needed as to whether testing for extremely small or extremely large effect sizes is scientifically necessary.
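As a minimal sketch of how such sample size tables can be tabulated, the function below uses Bonett's (2002) normal approximation for ln(1 − α̂) to size the test of H0: alpha = alpha0 against H1: alpha = alpha1. This particular approximation is an assumption for illustration and may differ from the exact method used in the study; it does, however, reproduce the qualitative finding that moving alpha0 from 0 to 0.5 inflates the required sample size.

```python
from math import ceil, log
from statistics import NormalDist

def alpha_test_sample_size(k, alpha0, alpha1, sig=0.05, power=0.80):
    """Approximate n for testing H0: alpha = alpha0 vs H1: alpha = alpha1
    (alpha1 > alpha0) on a k-item scale, via Bonett's (2002) approximation
    in which ln(1 - alpha-hat) has variance ~ 2k / ((k - 1)(n - 2))."""
    z = NormalDist().inv_cdf(1 - sig / 2) + NormalDist().inv_cdf(power)
    effect = log((1 - alpha0) / (1 - alpha1))  # log "effect size" of the test
    return ceil(2 + (2 * k / (k - 1)) * (z / effect) ** 2)

# With a 5-item scale and a target alpha of 0.7:
n_null_zero = alpha_test_sample_size(5, 0.0, 0.7)   # null alpha = 0 -> n = 16
n_null_half = alpha_test_sample_size(5, 0.5, 0.7)   # null alpha = 0.5 -> n = 78
```

Under this approximation, a null of zero needs fewer than 30 subjects, whereas the recommended null of 0.5 requires considerably more, mirroring the trade-off discussed above.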
PATIENTS AND METHODS: The dataset encompassed patient data from a tertiary cardiothoracic center in Malaysia between 2011 and 2015, sourced from electronic health records. Extensive preprocessing and feature selection ensured data quality and relevance. Four machine learning algorithms were applied: Logistic Regression, Gradient Boosted Trees, Support Vector Machine, and Random Forest. The dataset was split into training and validation sets, and hyperparameters were tuned. Evaluation criteria included accuracy, area under the ROC curve (AUC), precision, F-measure, sensitivity, and specificity. Ethical guidelines for data use and patient privacy were rigorously followed throughout the study.
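The split-tune-evaluate workflow described above can be sketched as follows with scikit-learn. The synthetic data, the 80:20 split ratio, and the hyperparameter grid are illustrative assumptions; the study's actual features, split, and grids are not specified in the abstract.

```python
# Sketch of the described workflow: split, tune hyperparameters, evaluate.
# make_classification stands in for the preprocessed patient dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Tune hyperparameters by cross-validation on the training set only.
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [2, 3]},
    scoring="roc_auc", cv=3)
grid.fit(X_train, y_train)

# Evaluate the tuned model on the held-out validation set.
pred = grid.predict(X_val)
prob = grid.predict_proba(X_val)[:, 1]
print(f"accuracy={accuracy_score(y_val, pred):.3f} "
      f"AUC={roc_auc_score(y_val, prob):.3f}")
```

Keeping tuning inside the training folds, as here, prevents the validation metrics from being inflated by hyperparameter leakage.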
RESULTS: With the highest accuracy (88.66%), AUC (94.61%), and sensitivity (91.30%), Gradient Boosted Trees emerged as the top performer. Random Forest displayed strong AUC (94.78%) and accuracy (87.39%). In contrast, the Support Vector Machine showed higher sensitivity (98.57%) with lower specificity (59.55%), along with lower accuracy (79.02%) and precision (70.81%). Logistic Regression maintained a balance between sensitivity (87.70%) and specificity (87.05%).
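The sensitivity/specificity trade-off reported above can be made concrete by computing both from a confusion matrix. The counts below are invented so that the ratios reproduce the reported SVM figures; the study's actual confusion matrices are not given in the abstract.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts for a model that flags almost every patient as
# positive: it misses only 1 true case (high sensitivity) but raises
# many false alarms (low specificity).
sens, spec = sensitivity_specificity(tp=69, fn=1, tn=53, fp=36)
print(f"sensitivity={sens:.2%} specificity={spec:.2%}")
```

This illustrates why a high-sensitivity screen such as the SVM here can still be unattractive when false positives carry real clinical cost.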
CONCLUSION: These findings imply that Gradient Boosted Trees and Random Forest might be effective methods for identifying patients who will develop acute kidney injury (AKI) following heart surgery. However, the specific goals, sensitivity/specificity trade-offs, and practical ramifications should all be considered when choosing an algorithm.
METHODS: This study analyzed death records from January 2017 to June 2022, sourced from Malaysia's Health Informatics Centre, coded into ICD-10. Data anonymization adhered to ethical standards, with 387,650 death registrations included after quality checks. The dataset, limited to three-digit ICD-10 codes, underwent cleaning and an 80:20 training-testing split. Preprocessing involved HTML tag removal and tokenization. ML approaches, including BERT (Bidirectional Encoder Representations from Transformers), Gzip+KNN (K-Nearest Neighbors), XGBoost (Extreme Gradient Boosting), TensorFlow, SVM (Support Vector Machine), and Naive Bayes, were evaluated for automated ICD-10 coding. Models were fine-tuned and assessed across accuracy, F1-score, precision, recall, specificity, and precision-recall curves using Amazon SageMaker (Amazon Web Services, Seattle, WA). Sensitivity analysis addressed unbalanced data scenarios, enhancing model robustness.
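Of the approaches listed, Gzip+KNN is the least conventional, so a minimal sketch of the idea may help: texts are compared by normalized compression distance (NCD) computed from gzip-compressed lengths, and a record is assigned the majority label of its k nearest training examples. The toy cause-of-death strings and ICD-10 codes below are invented for illustration, not drawn from the study's data.

```python
import gzip

def ncd(x: str, y: str) -> float:
    """Normalized compression distance via gzip-compressed lengths:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx = len(gzip.compress(x.encode()))
    cy = len(gzip.compress(y.encode()))
    cxy = len(gzip.compress((x + " " + y).encode()))
    return (cxy - min(cx, cy)) / max(cx, cy)

def gzip_knn_predict(text, train, k=3):
    """train is a list of (text, label) pairs; return the majority
    label among the k training texts nearest to `text` under NCD."""
    neighbours = sorted(train, key=lambda pair: ncd(text, pair[0]))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)

# Toy training records: free-text cause of death -> three-digit ICD-10 code.
train = [
    ("acute myocardial infarction", "I21"),
    ("myocardial infarction anterior wall", "I21"),
    ("cerebral infarction due to thrombosis", "I63"),
    ("cerebral infarction embolic", "I63"),
]
code = gzip_knn_predict("old myocardial infarction anterior", train, k=3)
```

Because shared substrings compress well together, records with similar wording end up close under NCD, which is why this compressor-based method can compete with trained classifiers without any model fitting.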
RESULTS: In assessing ICD-10 coding with ML, Gzip+KNN had the longest training time (10 hours), while BERT consumed the most memory. BERT performed best on F1-score (0.71) and accuracy (0.82), closely followed by Gzip+KNN. TensorFlow excelled in recall, whereas SVM had the highest specificity but lower overall performance. XGBoost was notably less effective across metrics. Precision-recall analysis showed Gzip+KNN's superiority. On an unbalanced dataset, BERT and Gzip+KNN demonstrated consistent accuracy.
CONCLUSION: Our study highlights that BERT and Gzip+KNN optimize ICD-10 coding, balancing efficiency, resource use, and accuracy. BERT excels in precision with higher memory demands, while Gzip+KNN offers robust accuracy and recall. This suggests significant potential for improving healthcare analytics and decision-making through advanced ML models.