Displaying publications 1 - 20 of 164 in total

  1. Naing NN
    Malays J Med Sci, 2003 Jul;10(2):84-6.
    PMID: 23386802 MyJurnal
    Determining a minimum required sample size 'n' is of particular importance when estimating a particular measurement of a population. This article highlights the determination of an appropriate sample size for estimating population parameters.
    Matched MeSH terms: Sample Size
  2. Arifin Wan Nor
    MyJurnal
    For pre-post and cross-over designs with numerical data, the paired t-test is the simplest analysis to perform. When planning such a study, it is imperative to calculate the sample size required for the test to detect the hypothesized difference. However, the sample size formula requires the standard deviation of the difference, which is not commonly reported. In this article, the author guides the reader through calculating the standard deviation of the difference from the standard deviation of each separate occasion.
    Matched MeSH terms: Sample Size
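    The calculation this abstract describes is not spelled out there; a minimal sketch, assuming the standard deviation of the difference is derived from the two occasions' standard deviations plus an assumed correlation r between occasions, combined with the usual normal-approximation sample size formula for a paired t-test (all parameter values below are illustrative):

```python
from math import sqrt, ceil
from statistics import NormalDist

def sd_of_difference(sd1, sd2, r):
    """SD of the paired differences from each occasion's SD and their correlation r."""
    return sqrt(sd1**2 + sd2**2 - 2 * r * sd1 * sd2)

def paired_t_sample_size(delta, sd_diff, alpha=0.05, power=0.80):
    """Normal-approximation number of pairs needed to detect mean difference `delta`."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_b = NormalDist().inv_cdf(power)          # power quantile
    return ceil(((z_a + z_b) * sd_diff / delta) ** 2)

# Illustrative values: both occasions have SD 10, assumed correlation 0.5,
# and we want to detect a mean difference of 5.
sd_d = sd_of_difference(10, 10, r=0.5)  # = 10.0
n = paired_t_sample_size(delta=5, sd_diff=sd_d)
print(sd_d, n)
```

With these illustrative numbers the sketch gives 32 pairs; the point is that once the correlation between occasions is known (or assumed), the unreported SD of the difference follows directly.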
  3. Ellappan, S., Khoo Michael, B. C.
    MyJurnal
    A multivariate control chart is a common tool used for monitoring and controlling a process whose quality is determined by several related variables. The objective of this study is to compare the performances of the multivariate exponentially weighted moving average (MEWMA) and the multivariate synthetic T2 control charts for the case of a multivariate normally distributed process. A comparative study is made based on the average run length (ARL) performances of the control charts, using the simulation method, in order to identify the chart with the best performance in monitoring the process mean vector. The performances of the two charts, for different sample sizes and correlation coefficients, are presented in this paper. It was found that the MEWMA chart outperformed the synthetic T2 chart for small shifts, but the latter prevailed for moderate shifts. Both charts performed equally well for larger shifts. In addition, the performances of both the MEWMA and synthetic T2 charts were found to be influenced by the sample size and correlation coefficient. The two charts’ performances improved as the sample size and correlation coefficient increased for small and moderate shifts, but did not depend on the sample size and correlation coefficient when the shift was large.
    Matched MeSH terms: Sample Size
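    The MEWMA chart in this study is multivariate, but the exponential smoothing recursion at its core can be illustrated with a univariate EWMA sketch; the smoothing constant λ and limit width L below are illustrative choices, not values from the study:

```python
from math import sqrt

def ewma_chart(xs, mu0, sigma, lam=0.2, L=3.0):
    """Univariate EWMA statistics with time-varying control limits.

    Returns a list of (z_i, lcl_i, ucl_i) tuples; a point falling outside
    its limits signals a shift in the process mean.
    """
    z = mu0
    out = []
    for i, x in enumerate(xs, start=1):
        z = lam * x + (1 - lam) * z  # EWMA recursion
        # Exact (time-varying) limit half-width at observation i
        half = L * sigma * sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * i)))
        out.append((z, mu0 - half, mu0 + half))
    return out

# In-control observations stay within the limits; a sustained upward
# shift accumulates in the EWMA statistic and eventually drifts out.
points = ewma_chart([0.1, -0.2, 0.05, 1.5, 1.8, 2.0, 2.2], mu0=0.0, sigma=1.0)
```

Because the statistic pools information across observations, small sustained shifts are detected faster than with a Shewhart-type chart, which matches the small-shift advantage reported for the MEWMA chart above.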
  4. Ainur, A.K., Sayang, M.D., Jannoo, Z., Yap, B.W.
    MyJurnal
    A Structural Equation Model (SEM) is often used to test whether a hypothesised theoretical model agrees with data by examining the model fit. This study investigates the effect of sample size and data distribution (normal and non-normal) on goodness-of-fit (GoF) measures in a structural equation model. Simulation results confirm that the GoF measures are affected by sample size, whereas they are quite robust when data are not normal. Absolute measures (GFI, AGFI, RMSEA) are more affected by sample size, while incremental fit measures such as TLI and CFI are less affected by sample size and non-normality.
    Matched MeSH terms: Sample Size
  5. Hashibah Hamid, Long MM, Sharipah Soaad Syed Yahaya
    Sains Malaysiana, 2017;46:1001-1010.
    The location model proposed in the past is a predictive discriminant rule that can classify new observations into one of two predefined groups based on mixtures of continuous and categorical variables. The ability of the location model to discriminate new observations correctly is highly dependent on the number of multinomial cells created by the categorical variables. This study conducts a preliminary investigation showing that the location model using maximum likelihood estimation has a high misclassification rate, up to 45% on average, when dealing with more than six categorical variables, for all 36 data sets tested. The model predicted poorly for large numbers of categorical variables even with large sample sizes. To alleviate the high misclassification rate, a new strategy is embedded in the discriminant rule by introducing nonlinear principal component analysis (NPCA) into the classical location model (cLM), mainly to handle the large number of categorical variables. This new strategy is investigated on simulated and real datasets through estimation of the misclassification rate using the leave-one-out method. The results of the numerical investigations demonstrate the feasibility of the proposed model, as the misclassification rate is dramatically decreased compared to the cLM for all 18 data settings. A practical application using a real dataset demonstrates a significant improvement, obtaining results comparable to the best methods compared. The overall findings reveal that the proposed model extends the applicability range of the location model, which was previously limited to six categorical variables for acceptable performance. This study shows that the proposed model with the new discrimination procedure can be used as an alternative for mixed-variable classification problems, primarily when facing a large number of categorical variables.
    Matched MeSH terms: Sample Size
  6. Tan H
    Theor Appl Genet, 1977 Jan;50(1):29-34.
    PMID: 24407495 DOI: 10.1007/BF00273794
    Estimates of general combining ability of parents for yield and girth obtained separately from seedlings and their corresponding clonal families in Phases II and IIIA of the RRIM breeding programme are compared. A highly significant positive correlation (r = 0.71***) is found between GCA estimates from seedling and clonal families for yield in Phase IIIA, but not in Phase II (r = -0.03(NS)) nor for girth (r = -0.27(NS)) in Phase IIIA. The correlations for Phase II yield and Phase IIIA girth, however, improve when the GCA estimates based on small sample size or reversed rankings are excluded. When the best selections (based on present clonal and seedling information) are compared, all five of the parents top-ranking for yield are common in Phase IIIA, but only two parents are common for yield and girth in Phases II and IIIA respectively. However, only one parent for yield in Phase II and two parents for girth in Phase IIIA would, if selected on clonal performance, have been omitted from the top-ranking selections made by previous workers using seedling information. These findings, therefore, justify the choice of parents based on GCA estimates for yield obtained from seedling performance. Similar justification cannot be offered for girth, for which analysis is confounded by uninterpretable site and seasonal effects.
    Matched MeSH terms: Sample Size
  7. Tan MS, Tan JW, Chang SW, Yap HJ, Abdul Kareem S, Zain RB
    PeerJ, 2016;4:e2482.
    PMID: 27688975 DOI: 10.7717/peerj.2482
    The potential of genetic programming (GP) in various fields has been demonstrated in recent years. In the bio-medical field, much GP research has focused on the recognition of cancerous cells and on gene expression profiling data. In this research, the aim is to study the performance of GP on survival prediction with a small oral cancer prognosis dataset, the first such study in the field of oral cancer prognosis.
    Matched MeSH terms: Sample Size
  8. Naing, L., Winn, T., Rusli, B.N.
    MyJurnal
    The sample size calculation for a prevalence study requires only a simple formula. However, there are a number of practical issues in selecting values for the parameters required by the formula. Several practical issues are addressed and appropriate recommendations given. The paper also suggests the use of a software calculator that checks the normal approximation assumption and incorporates the finite population correction in the sample size calculation.
    Matched MeSH terms: Sample Size
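    The simple formula this abstract refers to is not reproduced there; a common form, together with the finite population correction and a rough normal-approximation check, might look like this (the prevalence, precision and population values are illustrative):

```python
from math import ceil
from statistics import NormalDist

def prevalence_sample_size(p, d, alpha=0.05, N=None):
    """Sample size to estimate a prevalence p to within +/- d.

    Applies the finite population correction when a population size N is
    given, and prints a warning if the normal approximation looks doubtful
    (rule of thumb: n*p and n*(1-p) should both be at least 5).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n = z ** 2 * p * (1 - p) / d ** 2          # n = z^2 p(1-p) / d^2
    if N is not None:
        n = n / (1 + (n - 1) / N)              # finite population correction
    n = ceil(n)
    if n * p < 5 or n * (1 - p) < 5:
        print("warning: normal approximation may not hold")
    return n

print(prevalence_sample_size(p=0.30, d=0.05))          # large population
print(prevalence_sample_size(p=0.30, d=0.05, N=1000))  # with FPC
```

With an expected prevalence of 30% and +/-5% precision, the uncorrected formula gives 323 subjects, and a finite population of 1000 reduces this to 245, illustrating the two practical adjustments the abstract mentions.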
  9. Fauziah Maarof, Lim, Fong Peng, Noor Akma Ibrahim
    MyJurnal
    This paper investigates the confidence intervals of R2 MAD, the coefficient of determination based on median absolute deviation, in the presence of outliers. Bootstrap bias-corrected accelerated (BCa) confidence intervals, known to have a higher degree of correctness, are constructed for the mean and standard deviation of R2 MAD for samples generated from a contaminated standard logistic distribution. The results indicate that increasing the sample size and the percentage of contaminants in the samples, as well as perturbing the location and scale of the distribution, affects the lengths of the confidence intervals. The results obtained can also be used to verify the bound of R2 MAD.
    Matched MeSH terms: Sample Size
  10. Wan Nor Arifin
    MyJurnal
    One of the most common reasons researchers seek help from a statistician is sample size calculation. However, despite the common belief that it involves only formulas and calculation, researchers often ignore other aspects of research design that lead to a proper sample size calculation. In this article, the author outlines the basic steps toward sample size calculation. The author also introduces the logic behind sample size calculation for a single mean and a single proportion, in simplified and less intimidating forms for those not statistically inclined.
    Matched MeSH terms: Sample Size
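    For the single-mean case mentioned in this abstract, a minimal sketch of the usual normal-approximation formula n = (z·sigma/d)^2 (the SD and precision values below are illustrative, not from the article):

```python
from math import ceil
from statistics import NormalDist

def sample_size_single_mean(sigma, d, alpha=0.05):
    """n needed to estimate a population mean to within +/- d with (1-alpha) confidence."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / d) ** 2)

# e.g. estimating a mean systolic blood pressure (assumed SD 20 mmHg)
# to within +/- 5 mmHg at 95% confidence:
print(sample_size_single_mean(sigma=20, d=5))
```

Here the sketch gives 62 subjects; the logic mirrors the single-proportion case, with the assumed standard deviation playing the role of p(1-p).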
  11. Djauhari, M.A.
    ASM Science Journal, 2011;5(2):123-137.
    MyJurnal
    Almost half a century after it was introduced, Wilks’ statistic has come into application in industrial manufacturing process variability monitoring. This is an important breakthrough in the way experts monitor the variability of manufacturing processes, which is vital in modern industry. It leaves behind the traditional practice characterized by the use of a sample size n that equals 1 if the process variability monitoring is based on individual observations, and is greater than the number of variables p if one works with subgroup observations. The use of Wilks’ statistic allows us to work with n < p. This paper contains a review of process variability monitoring based on individual observations. First, some historical background of process variability monitoring in the general scheme is reviewed, before revealing how the philosophy of Wilks’ statistic can be further interpreted. Subsequently, it is shown that the way to monitor process variability depends on how the variability itself is measured. Finally, a new statistic for detecting shifts in variability based on individual observations is introduced and a new control chart is proposed. The performance of the proposed chart, compared with the Wilks chart, is quite promising. Therefore, some recommendations are given to better understand the history of manufacturing process variability monitoring.
    Matched MeSH terms: Sample Size
  12. Aini Zuhra Abdul Kadir, Jafri Mohd Rohani, Matthew Oluwole Arowolo
    MyJurnal
    This study develops a Road Safety Index (RSI) for commercial buses, with the aim of determining whether the proposed index can benefit stakeholders in mitigating road accidents and promoting road safety. Five risk factors were considered (Driver, Vehicle, Task, Hazard/Risk and Road), of which the three identified as high contributing factors (Driver, Vehicle and Road) were selected for the construction of the RSI. Driver risk perception data were collected using a survey instrument with a sample size of n = 465 to test the model, and the data fit the model perfectly. The main benefits of this approach and the subsequent development of the RSI are: (1) it enables organisations to justify investment in road safety by providing a measurement and evaluation mechanism; and (2) the index provides a balanced view of the impact of the three critical (DVR) risk factors that management can improve upon.
    Matched MeSH terms: Sample Size
  13. Ang, Kean Hua
    MyJurnal
    Methodology is essential in research, involving the processes of design, application and analysis. This literature review describes the relationship between the sampling area, the sample size, and the choice of measurement scale. The sample size can be determined through a formula (or equation). When the sample size is applied to the sampling area, probability or non-probability sampling is involved in determining the quantity and quality of the data collected for research. Probability sampling is divided into simple random, systematic, stratified, multi-stage and cluster sampling, while non-probability sampling can be divided into convenience, purposive, quota, snowball, dimensional, critical-case and maximum-variation sampling. Next, the measurement scale can be chosen from nominal, ordinal, interval and ratio scales in a questionnaire or interview; these four scales determine measurements such as the Likert scale, Thurstone scale, Guttman scale and the semantic differential scale in carrying out the research analysis. Therefore, the sample size and sampling area, together with the choice of measurement scale, are important parts of the methodology for smoothing and accelerating the process of collecting and gathering data.
    Matched MeSH terms: Sample Size
  14. Teh Sin Yin, Ong Ker Hsin, Soh Keng Lin, Khoo Michael Boon Chong, Teoh Wei Li
    Sains Malaysiana, 2015;44:1067-1075.
    The existing optimal design of the fixed sampling interval S2-EWMA control chart to monitor the sample variance of a process is based on the average run length (ARL) criterion. Since the shape of the run length distribution changes with the magnitude of the shift in the variance, the median run length (MRL) gives a more meaningful explanation about the in-control and out-of-control performances of a control chart. This paper proposes the optimal design of the S2-EWMA chart, based on the MRL. The Markov chain technique is employed to compute the MRLs. The performances of the S2-EWMA chart, double sampling (DS) S2 chart and S chart are evaluated and compared. The MRL results indicated that the S2-EWMA chart gives better performance for detecting small and moderate variance shifts, while maintaining almost the same sensitivity as the DS S2 and S charts toward large variance shifts, especially when the sample size increases.
    Matched MeSH terms: Sample Size
  15. Nor Aishah Ahad, Teh SY, Abdul Rahman Othman, Che Rohani Yaacob
    Sains Malaysiana, 2011;40:1123-1127.
    In many statistical analyses, data need to be approximately normally distributed. The Kolmogorov-Smirnov test, Anderson-Darling test, Cramer-von Mises test and Shapiro-Wilk test are four statistical tests that are widely used for checking normality. One of the factors that influence these tests is the sample size. For each of the normality tests mentioned, this study determined the sample sizes at which the test would indicate that the data are not normal. The performance of the tests was evaluated under various spectrums of non-normal distributions and different sample sizes. The results showed that the Shapiro-Wilk test is the best normality test because it rejects the null hypothesis of normality at the smallest sample size compared to the other tests, for all levels of skewness and kurtosis of these distributions.
    Matched MeSH terms: Sample Size
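    The four normality tests compared in this study are all available in SciPy; a small sketch applying them to a clearly normal and a clearly skewed sample (the sample size, seed and distributions are illustrative, not the study's settings):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
normal_data = rng.normal(loc=0.0, scale=1.0, size=100)
skewed_data = rng.exponential(scale=1.0, size=100)  # clearly non-normal

for name, data in [("normal", normal_data), ("skewed", skewed_data)]:
    sw = stats.shapiro(data)                 # Shapiro-Wilk
    ks = stats.kstest(data, "norm")          # Kolmogorov-Smirnov vs N(0, 1)
    cvm = stats.cramervonmises(data, "norm") # Cramer-von Mises
    ad = stats.anderson(data, dist="norm")   # Anderson-Darling (critical values)
    print(f"{name}: SW p={sw.pvalue:.3f}  KS p={ks.pvalue:.3f}  "
          f"CvM p={cvm.pvalue:.3f}  AD stat={ad.statistic:.2f}")
```

Running such a comparison across decreasing sample sizes, and noting the smallest n at which each test rejects normality for a given non-normal distribution, mirrors the study's evaluation criterion.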
  16. Samira Ehsani, Jayanthi Arasan, Noor Akma Ibrahim
    Sains Malaysiana, 2013;42:981-987.
    In this paper, we extended a repairable system model under general repair, based on repair history, to incorporate covariates. We calculated the bias, standard error and RMSE of the parameter estimates of this model at different sample sizes using simulated data. We applied the model to real demonstration data and tested for the existence of time trend, repair and covariate effects. Following that, we also conducted a coverage probability study on the Wald confidence interval estimates. Finally, we conducted hypothesis testing for the parameters of the model. The results indicated that the estimation procedure works well for the proposed model, but the Wald interval should be applied with much caution.
    Matched MeSH terms: Sample Size
  17. Kiani K, Arasan J, Habshah Midi
    Sains Malaysiana, 2012;41:471-480.
    There are numerous parametric models for analyzing survival data, such as the exponential, Weibull, log-normal and gamma. One such model is the Gompertz model, which is widely used in biology and demography. Most of these models have been extended to new forms to accommodate different types of censoring mechanisms and different types of covariates. In this paper, the performance of the Gompertz model with a time-dependent covariate in the presence of right-censored data was studied. Moreover, the performance of the model was compared at different censoring proportions (CP) and sample sizes, and the model was compared with the fixed covariate model. In addition, the effect of wrongly fitting a fixed covariate model to data with a time-dependent covariate was studied. Finally, two confidence interval estimation techniques, Wald and jackknife, were applied to the parameters of this model and the performance of the methods was compared.
    Matched MeSH terms: Sample Size
  18. Abdur Razzaque Mughal, Zakiyah Zain, Nazrina Aziz
    Sains Malaysiana, 2016;45:1763-1772.
    In this study, the group acceptance sampling plan (GASP) proposed by Aslam et al. (2011) is redesigned for the case where the lifetimes of test items follow a Pareto distribution of the second kind. The optimal plan parameters are found by considering various pre-determined design parameters. The plan parameters were obtained using an optimization solution, and it is concluded that the proposed plan is more efficient than the existing plan as it requires a smaller minimum sample size.
    Matched MeSH terms: Sample Size
  19. Zamira Hasanah Zamzuri, Mohd Syafiq Sapuan, Kamarulzaman Ibrahim
    Sains Malaysiana, 2018;47:1931-1940.
    The presence of extra zeros is commonly observed in traffic accident count data. Past research has opted for zero-altered models, explaining that the zeros stem from under-reporting. However, there is also an argument against this, since the zeros could arise from a Poisson trial process. Motivated by this argument, we explore the possibility that mixing several discrete distributions can contribute to the presence of extra zeros. Four simulation studies were conducted based on two accident scenarios and two discrete distributions, Poisson and negative binomial, considering six combinations of proportion values corresponding to low, moderate and high mean values in the distribution. The results of the simulation studies support the claim, as the presence of extra zeros is detected in most cases of mixed Poisson and mixed negative binomial data. Data sets dominated by a Poisson (or negative binomial) component with a low mean show an apparent excess of zeros even when the sample size is only 30. An illustration using a real data set confirms the same findings. Hence, it is essential to consider mixed discrete distributions as potential distributions when dealing with count data with extra zeros. This study contributes to creating awareness of possible alternative distributions for count data with extra zeros, especially in traffic accident applications.
    Matched MeSH terms: Sample Size
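    The mixing mechanism described in this abstract can be sketched by simulating a two-component Poisson mixture and comparing its zero fraction with that implied by a single Poisson of the same mean (the mixture weights and means below are illustrative, not the study's settings):

```python
import random
from math import exp

random.seed(1)

# Mixture: 70% Poisson(mean 0.3), 30% Poisson(mean 4.0) -- illustrative values
def draw():
    lam = 0.3 if random.random() < 0.7 else 4.0
    # Knuth's method for a single Poisson draw
    L, k, p = exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

sample = [draw() for _ in range(10000)]
mean = sum(sample) / len(sample)
observed_zero = sample.count(0) / len(sample)
poisson_zero = exp(-mean)  # zero probability under one Poisson with that mean
print(observed_zero, poisson_zero)
```

The observed zero fraction markedly exceeds the single-Poisson rate, so a mixture alone produces the "extra zeros" pattern without any under-reporting, which is the argument the study builds on.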
  20. ChongYong, Chua, HongChoon, Ong
    MyJurnal
    Score-based structure learning algorithms are commonly used in learning Bayesian networks. Besides the search strategy, scoring functions play a vital role in these algorithms. Many studies have proposed various types of scoring functions with different characteristics. In this study, we compare the performances of five scoring functions: the Bayesian Dirichlet equivalent-likelihood (BDe) score (equivalent sample size, ESS, of 4 and 10), the Akaike Information Criterion (AIC) score, the Bayesian Information Criterion (BIC) score and the K2 score. Instead of just comparing networks with different scores, we included different learning algorithms to study the relationship between scoring functions and greedy search learning algorithms. Structural Hamming distance is used to measure the difference between the networks obtained and the true network. The results are divided into two sections: the first studies the differences between data with different numbers of variables, and the second studies the differences between data with different sample sizes. In general, the BIC score performs well and consistently for most data, while the BDe score with an equivalent sample size of 4 performs better for data with larger sample sizes.
    Matched MeSH terms: Sample Size