MyMedR

Displaying publications 1 - 20 of 164 in total

Fulltext Determination of sample size

Naing NN

Malays J Med Sci, 2003 Jul;10(2):84-6.
PMID: 23386802 MyJurnal

Determining the minimum required sample size 'n' is particularly important for estimating a given measurement in a given population. This article highlights how to determine an appropriate sample size for estimating population parameters.

Matched MeSH terms: Sample Size
Fulltext Calculating standard deviation of difference for determination of sample size for planned paired t-test analysis

Arifin Wan Nor

Education in Medicine Journal, 2014;6(2):62-64.
MyJurnal

For pre-post and cross-over design analysis of numerical data, the paired t-test is the simplest analysis to perform. When planning such a study, it is imperative to calculate the appropriate sample size required for the test to detect the hypothesized difference. However, the sample size formula requires the standard deviation of the difference, which is not commonly reported. In this article, the author guides the reader through calculating the standard deviation of the difference from the standard deviation of each separate occasion.
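As an illustration of the calculation the abstract describes, the sketch below uses the standard identity for the variance of a difference of two correlated measurements, followed by a normal-approximation sample size for a paired comparison. The abstract does not state the article's exact formula, so the correlation input `r`, the function names, and the example values are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def sd_of_difference(sd_pre, sd_post, r):
    """SD of paired differences from the SD on each occasion.

    r is the assumed correlation between the two occasions
    (an illustrative input, not a value taken from the article).
    """
    return math.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)


def paired_sample_size(delta, sd_diff, alpha=0.05, power=0.80):
    """Number of pairs needed to detect a mean difference delta.

    Uses the normal approximation; a t-based calculation would
    give a slightly larger n for small samples.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sd_diff / delta) ** 2)


# Hypothetical planning values: SD of 10 on each occasion, correlation 0.5.
sd_d = sd_of_difference(sd_pre=10.0, sd_post=10.0, r=0.5)
n_pairs = paired_sample_size(delta=5.0, sd_diff=sd_d)
print(f"SD of difference: {sd_d:.2f}, pairs required: {n_pairs}")
```

A higher assumed correlation shrinks the standard deviation of the difference and therefore the required number of pairs, which is why the choice of `r` deserves justification in a study plan.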

Matched MeSH terms: Sample Size
Fulltext A study on the performances of multivariate exponentially weighted moving average (MEWMA) and multivariate synthetic charts

Ellappan, S., Khoo Michael, B. C.

Pertanika Journal of Science & Technology, 2014;22(2):541-552.
MyJurnal

A multivariate control chart is a common tool used for monitoring and controlling a process whose quality is determined by several related variables. The objective of this study is to compare the performances of the multivariate exponentially weighted moving average (MEWMA) and the multivariate synthetic T2 control charts, for the case of a multivariate normally distributed process. A comparative study is made based on the average run length (ARL) performances of the control charts, using the simulation method, in order to identify the chart having the best performance in monitoring the process mean vector. The performances of the two charts, for different sample sizes and correlation coefficients, are presented in this paper. It was found that the MEWMA chart outperformed synthetic T2 chart for small shifts but the latter prevailed for moderate shifts. Both charts performed equally well for larger shifts. In addition, the performances of both MEWMA and synthetic T2 charts were found to be influenced by sample size and correlation coefficient. The two charts’ performances improved as the sample size and correlation coefficient increased for small and moderate shifts, but the charts’ performances did not depend on sample size and correlation coefficient when the shift was large.

Matched MeSH terms: Sample Size
Fulltext Sample size and non-normality effects on goodness of fit measures in structural equation models

Ainur, A.K., Sayang, M.D., Jannoo, Z., Yap, B.W.

Pertanika Journal of Science & Technology, 2017;25(2):575-586.
MyJurnal

A Structural Equation Model (SEM) is often used to test whether a hypothesised theoretical model agrees with data by examining the model fit. This study investigates the effect of sample size and distribution of data (normal and non-normal) on goodness of fit (GoF) measures in structural equation models. Simulation results confirm that the GoF measures are affected by sample size, whereas they are quite robust when data are not normal. Absolute measures (GFI, AGFI, RMSEA) are more affected by sample size while incremental fit measures such as TLI and CFI are less affected by sample size and non-normality.

Matched MeSH terms: Sample Size
New discrimination procedure of location model for handling large categorical variables

Hashibah Hamid, Long MM, Sharipah Soaad Syed Yahaya

Sains Malaysiana, 2017;46:1001-1010.

The location model proposed in the past is a predictive discriminant rule that can classify new observations into one of two predefined groups based on mixtures of continuous and categorical variables. The ability of the location model to discriminate new observations correctly is highly dependent on the number of multinomial cells created by the number of categorical variables. This study conducts a preliminary investigation to show that the location model using maximum likelihood estimation has a high misclassification rate, up to 45% on average, when dealing with more than six categorical variables for all 36 data sets tested. The model gave highly incorrect predictions, performing badly for large numbers of categorical variables even with a large sample size. To alleviate the high rate of misclassification, a new strategy is embedded in the discriminant rule by introducing nonlinear principal component analysis (NPCA) into the classical location model (cLM), mainly to handle the large number of categorical variables. This new strategy is investigated on some simulated and real datasets through the estimation of the misclassification rate using the leave-one-out method. The results from the numerical investigations demonstrate the feasibility of the proposed model, as the misclassification rate is dramatically decreased compared to the cLM for all 18 different data settings. A practical application using a real dataset demonstrates a significant improvement and obtains results comparable to the best of the methods compared. The overall findings reveal that the proposed model extends the applicability range of the location model, which was previously limited to only six categorical variables to achieve acceptable performance. This study proved that the proposed model with the new discrimination procedure can be used as an alternative for mixed-variable classification problems, primarily when facing a large number of categorical variables.

Matched MeSH terms: Sample Size
Estimates of general combining ability in Hevea breeding at the Rubber Research Institute of Malaysia : I. Phases II and III A

Tan H

Theor Appl Genet, 1977 Jan;50(1):29-34.
PMID: 24407495 DOI: 10.1007/BF00273794

Estimates of general combining ability of parents for yield and girth obtained separately from seedlings and their corresponding clonal families in Phases II and IIIA of the RRIM breeding programme are compared. A highly significant positive correlation (r = 0.71***) is found between GCA estimates from seedling and clonal families for yield in Phase IIIA, but not in Phase II (r = -0.03(NS)) nor for girth (r = -0.27(NS)) in Phase IIIA. The correlations for Phase II yield and Phase IIIA girth, however, improve when the GCA estimates based on small sample size or reversed rankings are excluded. When the best selections (based on present clonal and seedling information) are compared, all five of the parents top-ranking for yield are common in Phase IIIA but only two parents are common for yield and girth in Phases II and IIIA respectively. However, only one parent for yield in Phase II and two parents for girth in Phase IIIA would, if selected on clonal performance, have been omitted from the top ranking selections made by previous workers using seedling information. These findings, therefore, justify the choice of parents based on GCA estimates for yield obtained from seedling performance. Similar justification cannot be offered for girth, for which analysis is confounded by uninterpretable site and seasonal effects.

Matched MeSH terms: Sample Size
Fulltext A genetic programming approach to oral cancer prognosis

Tan MS, Tan JW, Chang SW, Yap HJ, Abdul Kareem S, Zain RB

PeerJ, 2016;4:e2482.
PMID: 27688975 DOI: 10.7717/peerj.2482

The potential of genetic programming (GP) in various fields has been realised in recent years. In the biomedical field, much GP research has focused on the recognition of cancerous cells and on gene expression profiling data. This research aims to study the performance of GP on survival prediction using a small-sample oral cancer prognosis dataset; it is the first such study in the field of oral cancer prognosis.

Matched MeSH terms: Sample Size
Fulltext Practical issues in calculating the sample size for prevalence studies

Naing, L., Winn, T., Rusli, B.N.

Archives of Orofacial Sciences, 2006;1(1):9-14.
MyJurnal

The sample size calculation for a prevalence only needs a simple formula. However, there are a number of practical issues in selecting values for the parameters required in the formula. Several practical issues are addressed and appropriate recommendations are given. The paper also suggests the application of a software calculator that checks the normal approximation assumption and incorporates finite population correction in the sample size calculation.
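The abstract does not reproduce the formula, but the calculation it refers to is commonly written as n = Z²P(1−P)/d², optionally followed by a finite population correction. The sketch below follows that common form under stated assumptions: the function names, the n·P ≥ 5 rule of thumb for the normal approximation, and the example values are illustrative and are not taken from the paper or its software calculator.

```python
import math
from statistics import NormalDist


def prevalence_sample_size(p, d, conf=0.95, population=None):
    """Sample size to estimate a prevalence p within +/- d.

    p          expected prevalence (proportion between 0 and 1)
    d          desired precision (half-width of the confidence interval)
    population optional finite population size; when given, the
               finite population correction is applied
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    n = z**2 * p * (1 - p) / d**2
    if population is not None:
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


def normal_approx_ok(n, p, threshold=5):
    """Rule-of-thumb check (an assumption here): n*p and n*(1-p) >= threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


# Hypothetical inputs: expected prevalence 20%, precision +/- 5%.
n_unlimited = prevalence_sample_size(p=0.20, d=0.05)
n_finite = prevalence_sample_size(p=0.20, d=0.05, population=1000)
print(n_unlimited, n_finite, normal_approx_ok(n_unlimited, 0.20))
```

When the expected prevalence is very low, the check can fail for the computed n, which is a sign that the precision d was chosen too loosely relative to P.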

Matched MeSH terms: Sample Size
Fulltext Bootstrapping the confidence intervals of R2 MAD for samples from contaminated standard logistic distribution

Fauziah Maarof, Lim, Fong Peng, Noor Akma Ibrahim

Pertanika Journal of Science & Technology, 2010;18(1):-.
MyJurnal

This paper investigates the confidence intervals of R2 MAD, the coefficient of determination based on median absolute deviation, in the presence of outliers. Bootstrap bias-corrected accelerated (BCa) confidence intervals, known to have a higher degree of correctness, are constructed for the mean and standard deviation of R2 MAD for samples generated from a contaminated standard logistic distribution. The results indicate that increasing the sample size and the percentage of contaminants in the samples, and perturbing the location and scale of the distribution, affect the lengths of the confidence intervals. The results obtained can also be used to verify the bound of R2 MAD.

Matched MeSH terms: Sample Size
Fulltext Introduction to sample size calculation

Wan Nor Arifin

Education in Medicine Journal, 2013;5(2):89-96.
MyJurnal

One of the most common reasons why researchers seek help from a statistician is sample size calculation. However, despite the common belief that it only involves formulas and calculation, researchers often ignore other aspects of research design that lead to proper sample size calculation. In this article, the author outlines basic steps toward sample size calculation. The author also introduces the logic behind sample size calculation for a single mean and a single proportion in simplified and less intimidating forms for those not statistically inclined.
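For the single-mean case mentioned above, a minimal sketch of the usual precision-based calculation is n = (Zσ/d)², where σ is an assumed population standard deviation and d the desired precision. The article's own presentation is not given in the abstract, so the function name and the example values below are assumptions.

```python
import math
from statistics import NormalDist


def single_mean_sample_size(sigma, d, conf=0.95):
    """Sample size to estimate a mean within +/- d.

    sigma  assumed population standard deviation (e.g. from a pilot study)
    d      desired precision (half-width of the confidence interval)
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / d) ** 2)


# Hypothetical inputs: SD of 15 units, precision of +/- 3 units.
n_mean = single_mean_sample_size(sigma=15.0, d=3.0)
print(n_mean)
```

Halving d quadruples the required n, which is the main trade-off to settle before the formula is applied. The single-proportion case has the same shape with σ² replaced by P(1−P).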

Matched MeSH terms: Sample Size
Fulltext Manufacturing process variability: a review

Djauhari, M.A.

ASM Science Journal, 2011;5(2):123-137.
MyJurnal

Almost a half century after it was introduced, Wilks’ statistic has come into application in industrial manufacturing process variability monitoring. This is an important breakthrough in the way experts monitor the variability of manufacturing processes, which is vital in modern industry. It leaves behind the traditional practice characterized by the use of sample size n which equals 1, if the process variability monitoring is based on individual observations, and is greater than the number of variables p if one works with subgroup observations. The use of Wilks’ statistic allows us to work with n < p. This paper contains a review on process variability monitoring based on individual observations. First, some historical background of process variability monitoring in the general scheme was reviewed before it was revealed where the philosophy of Wilks’ statistic could be further interpreted. Subsequently it was indicated that the way to monitor the process variability depended on how the variability itself was measured. Finally, a new statistic for detecting the shift in variability based on individual observations was introduced and then a new control chart was proposed. The performance of the proposed chart, compared with the Wilks chart, was quite promising. Therefore, some recommendations were given to better understand the history of manufacturing process variability.

Matched MeSH terms: Sample Size
Fulltext Development and validation of road safety index for commercial bus

Aini Zuhra Abdul Kadir, Jafri Mohd Rohani, Matthew Oluwole Arowolo

Journal of Occupational Safety and Health, 2015;12(1):33-40.
MyJurnal

This study develops a Road Safety Index (RSI) for commercial buses with the aim of determining whether the proposed index can benefit stakeholders in mitigating road accidents and promoting road safety. Five risk factors were considered: Driver, Vehicle, Task, Hazard/Risk and Road. Three of these (Driver, Vehicle and Road) were identified as highly contributing critical factors and were selected for the construction of the RSI. Drivers' risk perception data were collected using a survey instrument with a sample size of n = 465 to test the model, and the data fit the model perfectly. The main benefits of this approach and the subsequent development of the RSI are: (1) it enables organisations to justify investment in road safety by providing a measurement and evaluation mechanism; (2) the index provides a balanced view of the impact of the three critical (DVR) risk factors that management can improve upon.

Matched MeSH terms: Sample Size
Fulltext Introduction to framework metodology in research study: a comprehensive study

Ang, Kean Hua

Malaysian Journal of Social Sciences and Humanities, 2016;1(4):42-52.
MyJurnal

Methodology is essential in a research study and involves the processes of design, application, and analysis. A literature review was conducted to describe the relationship between sampling area, sample size, and the determination of the measurement scale. The sample size can be determined through a formula (or equation). When the sample size is applied to a sampling area, probability and non-probability sampling are involved in determining the quantity and quality of data collected for research. Random probability sampling is divided into simple random, systematic, stages random, various stages random, and grouping; while non-probability sampling can be divided into chance, aimed, quota, snowball, dimensional, critical cases, and maximum variation. Next, the measurement scale can be determined as nominal, ordinal, interval or ratio in a questionnaire or interview; all four scales determine measurements such as the Likert scale, Thurstone scale, Guttman scale, and the semantic differential scale used in carrying out the analysis. Therefore, the sample size, the sampling area, and the choice of measurement scale are important in the methodology for smoothing and accelerating the process of collecting and gathering data.

Matched MeSH terms: Sample Size
A study on the S2-EWMA chart for monitoring the process variance based on the MRL performance

Teh Sin Yin, Ong Ker Hsin, Soh Keng Lin, Khoo Michael Boon Chong, Teoh Wei Li

Sains Malaysiana, 2015;44:1067-1075.

The existing optimal design of the fixed sampling interval S2-EWMA control chart to monitor the sample variance of a process is based on the average run length (ARL) criterion. Since the shape of the run length distribution changes with the magnitude of the shift in the variance, the median run length (MRL) gives a more meaningful explanation about the in-control and out-of-control performances of a control chart. This paper proposes the optimal design of the S2-EWMA chart, based on the MRL. The Markov chain technique is employed to compute the MRLs. The performances of the S2-EWMA chart, double sampling (DS) S2 chart and S chart are evaluated and compared. The MRL results indicated that the S2-EWMA chart gives better performance for detecting small and moderate variance shifts, while maintaining almost the same sensitivity as the DS S2 and S charts toward large variance shifts, especially when the sample size increases.

Matched MeSH terms: Sample Size
Sensitivity of normality tests to non-normal data

Nor Aishah Ahad, Teh SY, Abdul Rahman Othman, Che Rohani Yaacob

Sains Malaysiana, 2011;40:1123-1127.

In many statistical analyses, data need to be approximately normal or normally distributed. The Kolmogorov-Smirnov test, Anderson-Darling test, Cramer-von Mises test, and Shapiro-Wilk test are four statistical tests that are widely used for checking normality. One of the factors that influence these tests is the sample size. Given any test of normality mentioned, this study determined the sample sizes at which the tests would indicate that the data is not normal. The performance of the tests was evaluated under various spectrums of non-normal distributions and different sample sizes. The results showed that the Shapiro-Wilk test is the best normality test because this test rejects the null hypothesis of normality test at the smallest sample size compared to the other tests, for all levels of skewness and kurtosis of these distributions.

Matched MeSH terms: Sample Size
Modeling repairable system failure with repair history and covariates

Samira Ehsani, Jayanthi Arasan, Noor Akma Ibrahim

Sains Malaysiana, 2013;42:981-987.

In this paper, we extended a repairable system model under general repair, based on repair history, to incorporate covariates. We calculated the bias, standard error and RMSE of the parameter estimates of this model at different sample sizes using simulated data. We applied the model to real demonstration data and tested for the existence of a time trend, repair and covariate effects. Following that, we also conducted a coverage probability study on the Wald confidence interval estimates. Finally, we conducted hypothesis testing for the parameters of the model. The results indicated that the estimation procedure works well for the proposed model but the Wald interval should be applied with much caution.

Matched MeSH terms: Sample Size
Interval estimations for parameters of Gompertz model with time-dependent covariate and right censored data

Kiani K, Arasan J, Habshah Midi

Sains Malaysiana, 2012;41:471-480.

There are numerous parametric models for analyzing survival data, such as exponential, Weibull, log-normal and gamma. One such model is the Gompertz model, which is widely used in biology and demography. Most of these models have been extended to new forms to accommodate different types of censoring mechanisms and different types of covariates. In this paper the performance of the Gompertz model with a time-dependent covariate in the presence of right censored data was studied. Moreover, the performance of the model was compared at different censoring proportions (CP) and sample sizes. Also, the model was compared with the fixed covariate model. In addition, the effect of wrongly fitting a fixed covariate model to data with a time-dependent covariate was studied. Finally, two confidence interval estimation techniques, Wald and jackknife, were applied to the parameters of this model and the performance of the methods was compared.

Matched MeSH terms: Sample Size
Time truncated efficient testing strategy for Pareto distribution of the 2nd kind using weighted Poisson and Poisson distribution

Abdur Razzaque Mughal, Zakiyah Zain, Nazrina Aziz

Sains Malaysiana, 2016;45:1763-1772.

In this study, the group acceptance sampling plan (GASP) proposed by Aslam et al. (2011) is redesigned for the case where the lifetimes of test items follow the Pareto distribution of the 2nd kind. The optimal plan parameters are found by considering various pre-determined design parameters. The plan parameters were obtained using an optimization solution, and the study concludes that the proposed plan is more efficient than the existing plan as it requires the minimum sample size.

Matched MeSH terms: Sample Size
The extra zeros in traffic accident data: a study on the mixture of discrete distributions

Zamira Hasanah Zamzuri, Mohd Syafiq Sapuan, Kamarulzaman Ibrahim

Sains Malaysiana, 2018;47:1931-1940.

The presence of extra zeros is commonly observed in traffic accident count data. Past research has opted for zero-altered models and explained that the zeros arise from under-reporting. However, there is also an argument against this statement, since the zeros could arise from a Poisson trial process. Motivated by this argument, we explore the possibility of mixing several discrete distributions that can contribute to the presence of extra zeros. Four simulation studies were conducted based on two accident scenarios and two discrete distributions, Poisson and negative binomial, by considering six combinations of proportion values corresponding to low, moderate and high mean values in the distribution. The results of the simulation studies concur with the claim, as the presence of extra zeros is detected in most cases of mixed Poisson and mixed negative binomial data. Data sets that are dominated by Poisson (or negative binomial) with low mean show an apparent existence of extra zeros even though the sample size is only 30. An illustration using a real data set concurs with these findings. Hence, it is essential to consider mixed discrete distributions as potential distributions when dealing with count data with extra zeros. This study contributes to creating awareness of possible alternative distributions for count data with extra zeros, especially in traffic accident applications.
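The mechanism behind this claim can be checked directly for the Poisson case: a mixture of Poisson components always has at least as high a probability of zero as a single Poisson with the same overall mean, because exp(−λ) is convex in λ. The sketch below computes both probabilities; the mixture weights, component means, and function names are hypothetical and are not the settings used in the paper's simulations.

```python
import math


def poisson_zero_prob(mean):
    """P(X = 0) for a single Poisson distribution."""
    return math.exp(-mean)


def mixture_zero_prob(weights, means):
    """P(X = 0) for a finite mixture of Poisson components."""
    return sum(w * math.exp(-m) for w, m in zip(weights, means))


# Hypothetical mixture dominated by a low-mean component.
weights = [0.7, 0.3]
means = [0.5, 6.0]

overall_mean = sum(w * m for w, m in zip(weights, means))
p0_mixture = mixture_zero_prob(weights, means)
p0_single = poisson_zero_prob(overall_mean)

# Expected number of zeros in a sample of 30 under each model.
n = 30
print(f"mixture: {n * p0_mixture:.1f} zeros expected")
print(f"single Poisson, same mean: {n * p0_single:.1f} zeros expected")
```

Fitting a single Poisson to data generated this way would therefore show an apparent excess of zeros, even though no zero-altering process is present.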

Matched MeSH terms: Sample Size
Fulltext Comparison of scoring functions on greedy search Bayesian network learning algorithms

ChongYong, Chua, HongChoon, Ong

Pertanika Journal of Science & Technology, 2017;25(3):719-734.
MyJurnal

Score-based structure learning algorithms are commonly used in learning Bayesian networks. Other than the search strategy, scoring functions play a vital role in these algorithms. Many studies proposed various types of scoring functions with different characteristics. In this study, we compare the performances of five scoring functions: Bayesian Dirichlet equivalent-likelihood (BDe) score (equivalent sample size, ESS, of 4 and 10), Akaike Information Criterion (AIC) score, Bayesian Information Criterion (BIC) score and K2 score. Instead of just comparing networks with different scores, we included different learning algorithms to study the relationship between score functions and greedy search learning algorithms. Structural Hamming distance is used to measure the difference between the networks obtained and the true network. The results are divided into two sections, where the first section studies the differences between data with different numbers of variables and the second section studies the differences between data with different sample sizes. In general, the BIC score performs well and consistently for most data while the BDe score with an equivalent sample size of 4 performs better for data with bigger sample sizes.

Matched MeSH terms: Sample Size
