BACKGROUND: The aim of this paper was to review the approaches currently used to analyze multi-country survey data, focusing on design and modeling issues in analyses of significant multi-country surveys published in 2010.
METHODS: A systematic search strategy was used to identify the 10 multi-country surveys and the articles published from them in 2010. The surveys were selected to reflect diverse topics and foci; and provide an insight into analytic approaches across research themes. The search identified 159 articles appropriate for full text review and data extraction.
RESULTS: The analyses adopted in the multi-country surveys can be broadly classified as univariate/bivariate analyses and multivariate/multivariable analyses. Multivariate/multivariable analyses may be further divided into design- and model-based analyses. Of the 159 articles reviewed, 129 used model-based analyses and 30 used design-based analyses. Similar patterns could be seen in all the individual surveys.
CONCLUSION: While there is general agreement among survey statisticians that complex surveys are most appropriately analyzed using design-based analyses, most researchers continued to use the more common model-based approaches. Recent developments in design-based multi-level analysis may be one approach to include all the survey design characteristics. This is a relatively new area, however, and further statistical and applied analytic research is required. An important limitation of this study relates to the selection of the surveys used and the choice of year for the analysis, i.e., year 2010 only. There is, however, no strong reason to believe that analytic strategies have changed radically in the past few years, and 2010 provides a credible snapshot of current practice.
Correct identification of ethnicity is central to many epidemiologic analyses. Unfortunately, ethnicity data are often missing. Successful classification typically relies on large databases (n > 500,000 names) of known name-ethnicity associations. We propose an alternative naïve Bayesian strategy that uses substrings of full names. Name and ethnicity data for Malays, Indians, and Chinese were provided by a health and demographic surveillance site operating in Malaysia from 2011 to 2013. The data comprised a training data set (n = 10,104) and a test data set (n = 9,992). Names were split into contiguous 3-letter substrings, and these were used as the basis for the Bayesian analysis. Performance was evaluated on both data sets using Cohen's κ and measures of sensitivity and specificity. There was little difference between the classification performance in the training and test data (κ = 0.93 and 0.94, respectively). For the test data, the sensitivity values for the Malay, Indian, and Chinese names were 0.997, 0.855, and 0.932, respectively, and the specificity values were 0.907, 0.998, and 0.997, respectively. A naïve Bayesian strategy for the classification of ethnicity is promising. It performs at least as well as more sophisticated approaches. The possible application to smaller data sets is particularly appealing. Further research examining other substring lengths and other ethnic groups is warranted.
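The substring strategy described above can be illustrated with a minimal sketch. It assumes a multinomial naïve Bayes model with add-one smoothing over 3-letter substrings; the abstract does not state the smoothing scheme or how spaces and case were handled, so those choices, the class name TrigramNaiveBayes, and the toy strings and labels are hypothetical and are not taken from the study data.

```python
import math
from collections import Counter, defaultdict


def trigrams(name):
    """Return the contiguous 3-letter substrings of a lower-cased name."""
    s = name.lower().strip()
    return [s[i:i + 3] for i in range(len(s) - 2)]


class TrigramNaiveBayes:
    """Multinomial naive Bayes over name trigrams (add-one smoothing is an assumption)."""

    def fit(self, names, labels):
        self.class_counts = Counter(labels)
        self.gram_counts = defaultdict(Counter)
        for name, label in zip(names, labels):
            self.gram_counts[label].update(trigrams(name))
        # Vocabulary of all trigrams seen in training, used for smoothing.
        self.vocab = {g for counts in self.gram_counts.values() for g in counts}
        return self

    def predict(self, name):
        total = sum(self.class_counts.values())
        v = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.gram_counts[label]
            denom = sum(counts.values()) + v
            # Log prior plus the sum of smoothed log likelihoods of each trigram.
            score = math.log(n / total)
            for g in trigrams(name):
                score += math.log((counts[g] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy placeholder strings and labels, purely illustrative.
model = TrigramNaiveBayes().fit(
    ["abcabc", "abcd", "xyzxyz", "wxyz"],
    ["group_a", "group_a", "group_b", "group_b"],
)
```

With this toy model, model.predict("zabc") favours group_a because the trigram "abc" was seen only in that group. A real application would train on labelled names and report κ, sensitivity and specificity on held-out data, as the abstract does.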
Measures of household socio-economic position (SEP) are widely used in health research. There exist a number of approaches to their measurement, with Principal Components Analysis (PCA) applied to a basket of household assets being one of the most common. PCA, however, carries a number of assumptions about the distribution of the data which may be untenable, and alternative, non-parametric approaches may be preferred. Mokken scale analysis is a non-parametric, item response theory approach to scale development which appears never to have been applied to household asset data. A Mokken scale can be used to rank order items (measures of wealth) as well as households. Using data on household asset ownership from a national sample of 4,154 consenting households in the World Health Survey from Vietnam, 2003, we constructed two measures of household SEP. Seventeen items asking about assets, and utility and infrastructure use were used. Mokken scaling and PCA were applied to the data. A single-item measure of total household expenditure was used as a point of contrast.
OBJECTIVES: Measuring the intraclass correlation coefficient (ICC) and design effect (DE) may help to tailor public health interventions for body mass index (BMI), physical activity and diet through geographic targeting in different countries. The purpose of this study was to quantify the level of clustering and DE in BMI, physical activity and diet in 56 low-income, middle-income and high-income countries.
DESIGN: Cross-sectional study design.
SETTING: Multicountry national survey data.
METHODS: The World Health Survey (WHS), 2003, data were used to examine clustering in BMI, physical activity measured in metabolic equivalent of task (MET) and diet measured as fruit and vegetable intake (FVI) in low-income, middle-income and high-income countries. Multistage sampling in the WHS used geographical clusters as primary sampling units (PSUs). These PSUs were used as the clustering or grouping variable in this analysis. Multilevel intercept-only regression models were used to calculate the ICC and DE for each country.
RESULTS: The median ICC (0.039) and median DE (1.82) for BMI were low; however, FVI had a higher median ICC (0.189) and median DE (4.16). For MET, the median ICC was 0.141 and median DE was 4.59. In some countries, however, the ICC and DE for BMI were large. For instance, South Africa had the highest ICC (0.39) and DE (11.9) for BMI, whereas Uruguay had the highest ICC (0.434) for MET and Ethiopia had the highest ICC (0.471) for FVI.
CONCLUSIONS: This study shows that across a wide range of countries, there was low area level clustering for BMI, whereas MET and FVI showed high area level clustering. These results suggested that the country level clustering effect should be considered in developing preventive approaches for BMI, as well as improving physical activity and healthy diets for each country.
KEYWORDS: Body Mass Index (BMI); Intraclass correlation coefficient (ICC); Physical activity (METs)
Study name: World Health Survey (Malaysia is a study site)
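The two quantities reported above can be sketched as follows. This assumes the standard definitions, which the abstract does not spell out: the ICC is the between-cluster share of total variance from an intercept-only two-level model, and the DE is Kish's approximation 1 + (m - 1) × ICC for an average cluster size m. The function names and the input values are hypothetical and are not estimates from the WHS.

```python
def icc(var_between, var_within):
    """Intraclass correlation: share of total variance lying between clusters."""
    return var_between / (var_between + var_within)


def design_effect(icc_value, mean_cluster_size):
    """Kish's approximate design effect for cluster samples (assumed formula)."""
    return 1.0 + (mean_cluster_size - 1.0) * icc_value


# Hypothetical variance components from an intercept-only multilevel model.
rho = icc(var_between=1.0, var_within=9.0)         # about 0.1
deff = design_effect(rho, mean_cluster_size=21.0)  # about 3.0
```

Under these assumptions, even a modest ICC inflates the variance substantially when clusters are large, which is why a low ICC can still coincide with a DE well above 1.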
BACKGROUND: This study explores the relationship between BMI and national-wealth and the cross-level interaction effect of national-wealth and individual household-wealth using multilevel analysis.
METHODS: Data from the World Health Survey, conducted in 2002-2004 across 70 low-, middle- and high-income countries, were used. Participants aged 18 years and over were selected using multistage, stratified cluster sampling. BMI was used as the outcome variable. The potential individual-level determinants of BMI were participants' sex, age, marital status, education, occupation, household wealth and location (rural/urban). The country-level factors used were average national income (GNI-PPP) and income inequality (Gini index). A two-level random-intercepts and fixed-slopes model structure with individuals nested within countries was fitted, treating BMI as a continuous outcome.
RESULTS: The weighted mean BMI and standard error of the 206,266 people from 70 countries was 23.90 (4.84). All the low-income countries were below the 25.0 mean BMI level and most of the high-income countries were above. All wealthier quintiles of household wealth had higher BMI than the lowest quintile. Each USD 10,000 increase in GNI-PPP was associated with a 0.4 unit increase in BMI. The Gini index was not associated with BMI. Together, these variables explained 28.1% of country-level, 4.9% of individual-level and 7.7% of total variance in BMI. The cross-level interaction effect between GNI-PPP and household wealth was significant. BMI increased as GNI-PPP increased in the first four quintiles of household wealth. However, the BMI of the wealthiest people decreased as GNI-PPP increased.
CONCLUSION: Both individual-level and country-level factors made an independent contribution to the BMI of the people. Household-wealth and national-income had significant interaction effects.
Study name: World Health Survey (Malaysia is a study site)
Forecasting higher than expected numbers of health events provides potentially valuable insights in its own right, and may contribute to health services management and syndromic surveillance. This study investigates the use of quantile regression (QR) to predict higher than expected respiratory deaths. Data taken from 70,830 deaths occurring in New York were used. Temporal, weather and air quality measures were fitted using quantile regression at the 90th percentile with half the data (in-sample). Four QR models were fitted: an unconditional model predicting the 90th percentile of deaths (Model 1), a seasonal/temporal model (Model 2), a seasonal, temporal model plus lags of weather and air quality (Model 3), and a seasonal, temporal model with 7-day moving averages of weather and air quality (Model 4). Models were cross-validated with the out-of-sample data. Performance was measured as the proportionate reduction in the weighted sum of absolute deviations achieved by a conditional model over the unconditional model; i.e., the coefficient of determination (R1). The coefficient of determination showed an improvement over the unconditional model of between 0.16 and 0.19. The greatest improvement in predictive and forecasting accuracy of daily mortality was associated with the inclusion of seasonal and temporal predictors (Model 2). No gains were made in the predictive models with the addition of weather and air quality predictors (Models 3 and 4). However, forecasting models that included weather and air quality predictors performed slightly better than the seasonal and temporal model alone (i.e., Model 3 > Model 4 > Model 2). This study provided a new approach to predicting higher than expected numbers of respiratory-related deaths. The approach, while promising, has limitations and should be treated at this stage as a proof of concept.
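The performance measure described above can be sketched in a few lines. This assumes the usual quantile-regression check (pinball) loss as the "weighted sum of absolute deviations" and defines R1 as one minus the ratio of the conditional to the unconditional loss; the function names and the toy numbers are hypothetical and are not the study's data or code.

```python
def check_loss(y, pred, tau):
    """Weighted sum of absolute deviations at quantile tau (check/pinball loss)."""
    total = 0.0
    for yi, pi in zip(y, pred):
        u = yi - pi
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += u * (tau - (1.0 if u < 0 else 0.0))
    return total


def r1(y, cond_pred, uncond_pred, tau):
    """Proportionate reduction in check loss of a conditional over an unconditional model."""
    return 1.0 - check_loss(y, cond_pred, tau) / check_loss(y, uncond_pred, tau)


# Toy daily counts; the unconditional model predicts one constant 90th-percentile value.
observed = [10.0, 12.0, 20.0, 22.0]
unconditional = [21.0] * len(observed)
conditional = [11.0, 13.0, 21.0, 22.0]
improvement = r1(observed, conditional, unconditional, tau=0.9)
```

An R1 of 0 means the conditional model does no better than the constant quantile, and values closer to 1 indicate a larger reduction in loss. In an out-of-sample check, the unconditional quantile would be estimated on the in-sample half.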
The concept of forecasting asthma using humans as animal sentinels is uncommon. This study explores the plausibility of predicting future daily asthma admissions using retrospective data from London (2005-2006). Negative binomial regressions were used in modeling, allowing for non-contiguous autoregressive components. Lags were selected from the partial autocorrelation function (PACF) plot, with a maximum lag of 7 days. The model was contrasted with naïve historical and seasonal models. All models were cross-validated. The mean number of daily asthma admissions was 27.9 in 2005 and 28.9 in 2006. Lags 1, 2, 3, 6 and 7 were independently associated with daily asthma admissions based on their PACF plots. The lag model's predictions of peak admissions were often slightly out of synchronization with the actual data, but the days of greater admissions were better matched than the days of lower admissions. Further investigation across various populations is necessary.
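The idea of non-contiguous autoregressive components can be illustrated by how the predictors are laid out. The sketch below only builds the lagged predictor rows for lags 1, 2, 3, 6 and 7; fitting the negative binomial regression itself is left to a statistics package. The function name and the toy series are hypothetical.

```python
def lagged_rows(series, lags):
    """Pair each day's count with the counts observed at the selected lags."""
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        features = [series[t - k] for k in lags]  # e.g. the counts 1, 2, 3, 6 and 7 days earlier
        rows.append((features, series[t]))
    return rows


# Toy daily admission counts; lags 4 and 5 are deliberately skipped.
counts = [25, 30, 28, 27, 31, 29, 26, 33, 30, 28]
rows = lagged_rows(counts, lags=[1, 2, 3, 6, 7])
```

Each row holds the selected lagged counts and the same-day count to be predicted; the first seven days are dropped because their full lag history is unavailable.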
Asthma is a global public health problem and the most common chronic disease among children. The factors associated with the condition are diverse, and environmental factors appear to be the leading cause of asthma exacerbation and its worsening disease burden. However, it remains unknown how changes in the environment affect asthma over time, and how temporal or environmental factors predict asthma events. The methodologies for forecasting asthma and other similar chronic conditions are not comprehensively documented anywhere to account for semistructured noncausal forecasting approaches. This paper highlights and discusses practical issues associated with asthma and the environment, and suggests possible approaches for developing decision-making tools in the form of semistructured black-box models, which is relatively new for asthma. Two statistical methods which can potentially be used in predictive modeling and health forecasting for both anticipated and peak events are suggested. Importantly, this paper attempts to bridge the areas of epidemiology, environmental medicine and exposure risks, and health services provision. The ideas discussed herein will support the development and implementation of early warning systems for chronic respiratory conditions in large populations, and ultimately lead to better decision-making tools for improving health service delivery.
Health forecasting is a novel area of forecasting, and a valuable tool for predicting future health events or situations such as demands for health services and healthcare needs. It facilitates preventive medicine and health care intervention strategies by pre-informing health service providers to take appropriate mitigating actions to minimize risks and manage demand. Health forecasting requires reliable data, information and appropriate analytical tools for the prediction of specific health conditions or situations. There is no single approach to health forecasting, and so various methods have often been adopted to forecast aggregate or specific health conditions. Meanwhile, there are no defined health forecasting horizons (time frames) to match the choices of health forecasting methods/approaches that are often applied. The key principles of health forecasting have also not been adequately described to guide the process. This paper provides a brief introduction and theoretical analysis of health forecasting. It describes the key issues that are important for health forecasting, including definitions, principles of health forecasting, and the properties of health data, which influence the choices of health forecasting methods. Other matters related to the value of health forecasting, and the general challenges associated with developing and using health forecasting services, are discussed. This overview is a stimulus for further discussions on standardizing health forecasting approaches and methods that will facilitate health care and health services delivery.
Health forecasting forewarns the health community about future health situations and disease episodes so that health systems can better allocate resources and manage demand. The tools used for developing health forecasts and measuring their accuracy and validity are commonly not defined, although they are usually adapted forms of statistical procedures. This review identifies previous typologies used in classifying the forecasting methods commonly used in forecasting health conditions or situations. It then discusses the strengths and weaknesses of these methods and presents the choices available for measuring the accuracy of health-forecasting models, including a note on the discrepancies in the modes of validation.
Effective population-level solutions to the obesity pandemic have proved elusive. In low- and middle-income countries the problem may be further challenged by the perceived internal tension between economic development and sustainable solutions which create the optimal conditions for human health and well-being. This paper discusses some of the ecological obstacles to addressing the growing problem of obesity in 'aspiring' economies, using Malaysia as a case study. The authors conclude that current measures to stimulate economic growth in Malaysia may actually be exacerbating the problem of obesity in that country. Public health solutions which address the wider context in which obesity exists are needed to change the course of this burgeoning problem.
Statins are known to reduce cardiovascular morbidity and mortality in primary and secondary prevention studies. Subsequently, a number of nonrandomised studies have shown that statins improve clinical outcomes in patients with heart failure (HF). Small randomised controlled trials (RCTs) also show improved cardiac function, reduced inflammation and reduced mortality with statins in HF. However, the findings of two large RCTs do not support the evidence provided by previous studies and suggest that statins lack beneficial effects in HF. Two meta-analyses have shown that statins do not improve survival, whereas two others showed improved cardiac function and reduced inflammation in HF. It appears that lipophilic statins produce better survival and other outcome benefits than hydrophilic statins. However, the two types have not been compared directly in HF trials.
Health forecasting can improve health service provision and individual patient outcomes. Environmental factors are known to impact chronic respiratory conditions such as asthma, but little is known about the extent to which these factors can be used for forecasting. Using weather, air quality and hospital asthma admissions data from London (2005-2006), two related negative binomial models were developed and compared with a naive seasonal model. In the first approach, predictive forecasting models were fitted with 7-day averages of each potential predictor, and a multivariable model was then constructed. In the second approach, an exhaustive search was conducted for the best-fitting models among possible combinations of lags (0-14 days) of all the environmental effects on asthma admissions. Three models were considered: a base model (seasonal effects), contrasted with a 7-day average model and a selected lags model (weather and air quality effects). Season was the best predictor of asthma admissions. The 7-day average and seasonal models were trivial to implement. The selected lags model was computationally intensive, but of no real value over much more easily implemented models. Seasonal factors can predict daily hospital asthma admissions in London, and there is little evidence that additional weather and air quality information would add to forecast accuracy.
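The 7-day average predictors in the first approach can be sketched as a trailing mean. The abstract does not say whether the window includes the current day; this sketch assumes it covers the seven preceding days only, so that each value is available before the day being forecast. The function name and the toy temperatures are hypothetical.

```python
def trailing_mean(series, window=7):
    """Mean of the `window` values preceding each day (current day excluded; an assumption)."""
    out = []
    for t in range(window, len(series)):
        past = series[t - window:t]
        out.append(sum(past) / window)
    return out


# Toy daily temperatures; the first value returned corresponds to day index 7.
temperature = [10.0, 12.0, 11.0, 13.0, 14.0, 12.0, 12.0, 15.0, 16.0]
temperature_7day = trailing_mean(temperature)
```

Each smoothed value would then enter the negative binomial model as a single predictor, which is what makes this approach much cheaper than searching over combinations of individual lags.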
Statins lower serum cholesterol and are employed for primary and secondary prevention of cardiovascular events. Clinical evidence from observational studies, retrospective data, and post hoc analyses of data from large statin trials in various cardiovascular conditions, as well as small-scale randomized trials, suggests survival and other outcome benefits for heart failure. Two recent large randomized controlled trials, however, appear to suggest that statins do not have beneficial effects in heart failure. In addition to lowering cholesterol, statins are believed to have many pleiotropic effects which could possibly influence the pathophysiology of heart failure. Following the two large trials, evidence from recent studies appears to support the use of statins in heart failure. This review discusses the role of statins in the pathophysiology of heart failure and the current evidence for statin use in heart failure, and suggests directions for future research.