Modelling Global Burden of Disease Measures in Selected European Countries Using Robust Dynamic Spatial Panel Data Models

The aim of the paper is to study the relationships between selected socio-economic factors and the health of European citizens. The health level is measured by selected global burden of disease measures: DALYs (Disability-Adjusted Life Years) and its two components, YLL (Years of Life Lost) and YLD (Years Lived with Disability). We identify which factors significantly affect these indicators of health. The empirical study uses panel data comprising 16 countries, mostly from the old EU, in the period 2003-2013. Fixed-effects dynamic spatial panel data (DSPD) models are used to account for autocorrelation of the dependent variables across time and space. The models are estimated with a novel, modified quasi maximum likelihood method based on M-estimators, proposed by Yang. The approach is robust to the distribution of the initial observations. The empirical analysis covers specification, estimation, and verification of the models. The results show that changes in YLD are significantly related to alcohol consumption, healthcare spending, social spending, GDP growth rate and years of education. Exactly the same set of factors is associated with variation in DALYs. Sensitivity of the YLL component to the socio-economic factors is considerably weaker.


Introduction
Health inequality is a serious challenge for health and economic policies. According to the EU's "health in all policies" approach, long-term intersectoral actions should be undertaken. "Implementing public health management programs aimed at minimizing health inequalities at national and international levels should take into account conclusions from research on identification of factors determining the health levels of various populations" (Eurostat, 2012). The health level of a population can be assessed using either summary measures of expected health or measures of lack of health (Murray et al., 2002; Robine, 2006; Wróblewska, 2008). In this paper, we use measures from the latter group, namely selected Global Burden of Disease (GBD) indicators: Years Lived with Disability (YLD), Years of Life Lost due to premature mortality (YLL), and Disability-Adjusted Life Years (DALYs). Identifying the determinants of premature death and morbidity, which together define the health level of a population, makes it possible to formulate adequate health policy as well as health, social, and economic programs (Dahlgren, Whitehead, 2007).
Several studies explore the impact of the entire socio-economic environment on health by applying multivariate analysis at various spatial levels in order to explain health inequalities (Frohlich, Mustard, 1996; Cavalini, De Leon, 2008). More recently, Orwat-Acedańska (2018) studied the problem of identifying factors affecting the DALYs measure. While the cited paper offers valuable insight into the relationship between DALYs and some socio-economic determinants, it does not account for the dual nature of the analysed measure, namely the decomposition of the DALYs index into its two components.
This paper aims to fill this gap. It identifies socio-economic factors that affect not only DALYs but also its two components: YLD and YLL. It seeks not only to determine the factors associated with the DALYs measure but also to investigate the sources of the observed dependencies, that is, whether they can be attributed to changes in the years lived with disability or in the life years lost due to premature death.
The time series of the GBD indicators for Europe are characterized by significant autocorrelation in both time and space. Standard multiple regression and purely spatial models are unable to account properly for both types of dependency. Therefore, we propose using dynamic spatial panel data (DSPD) econometric models, in particular spatial dynamic panel data (SDPD) models with fixed effects and spatial autocorrelation of the error term. These models will be called Dynamic Spatial Autoregressive Fixed Effects Models (DSAR-FEM).
These models extend the spatial panel framework with time dynamics that represent the trend in a dependent variable. There are several estimation methods for dynamic spatial panel models: the Maximum Likelihood (ML) method, the Generalized Method of Moments (GMM), and the Method of Instrumental Variables (IV). Effective estimation techniques for various types of spatial panel models are discussed in Elhorst (2010a: 377-407; 2010b) and Lee, Yu (2010a: 165-185; 2010b; 2010c: 255-271; 2010d). Maximum likelihood or quasi maximum likelihood estimators are commonly regarded as more efficient than GMM and IV estimators (Hsiao, Pesaran, Tahmiscioglu, 2002; Binder, Hsiao, Pesaran, 2005; Bun, Carree, 2005; Elhorst, 2010c; Gourieroux, Phillips, Yu, 2010; Kruiniger, 2013). Adding the time-dynamic effect results in bias and efficiency loss of the standard estimators. This problem is particularly severe in the case of short panels (with a small number of periods). The main difficulty in using the ML method to estimate spatial panel data models with short panels is the modeling of the initial observations (the data generating process for the pre-sample period), because the statistical properties of the ML estimators crucially depend on the assumptions about the initial observations (Dańska-Borsiak, 2011). A model for the initial differences involves the unknown process starting time. It is therefore highly desirable to have a method that is free from the specification of the initial observations and possesses good statistical properties, especially in the case of short panels.
In this paper, we use a novel, modified quasi maximum likelihood (QML) method with M-estimators proposed recently by Yang (2018). M-estimators are treated as a class of robust estimators known from robust statistics (Huber, 1981; Hampel et al., 1986). Robust estimators are aimed at improving estimation results in the case of deviations from the classic assumptions. The robustness of the M-estimator employed by Yang for the DSAR-FEM consists in freedom from assumptions on the distribution of the initial observations. Moreover, the M-estimators are consistent and asymptotically normally distributed.
The paper is organized as follows. In the second section, we introduce the definitions of the YLD, YLL and DALYs measures. We also present the evolution of DALYs in European countries since 1990 in order to motivate the choice of the model class.
In the third section, we present the intuition behind and the exact specification of the Dynamic Spatial Autoregressive Fixed Effects Models (DSAR-FEM). In this section, we also explain the M-estimation approach. The fourth section contains the empirical analysis. It consists of two subsections. First, we describe the explanatory variables and the main assumptions used in the empirical study. Then, we present and discuss the results. The last section concludes the paper.

Selected indicators of global burden of disease
The first worldwide study of the burden of disease, commissioned by WHO, was conducted in 1990 by a group of experts led by Christopher J. L. Murray from the Institute for Health Metrics and Evaluation (Murray, Lopez, 1996a; 1996b). The study resulted in a comprehensive regional and global research program called the Global Burden of Disease Study (GBD). It provided several measures of the health burden of populations. Below, we present a few of them.

Years Lived with Disability (YLD)
Years Lived with Disability (YLD) refer to years lived in health worse than ideal. To estimate YLD on a population basis, the number of disability cases is multiplied by the average duration of the disease and a weight factor that reflects the severity of the disease on a scale from 0 (perfect health) to 1 (death). The basic formula for one disabling event is (Murray, Lopez, Alan, 1994):
$$\mathrm{YLD} = I \times DW \times L, \qquad (1)$$
where: I - the number of incident cases; DW - the disability weight; L - the average duration of disability (years). The weights are calculated using the person trade-off method. The average duration of a disease L takes into account a person's age and is discounted (today's health level has a higher weight than a future one).
These values are then used to define 7 classes of disability and severity of several hundred treated and untreated diseases. If both age-weighting and discounting are applied, and the years between the event and the life expectancy are summed, the initially simple formula (1) for YLD becomes more complicated (for a single case):
$$\mathrm{YLD} = DW \left\{ \frac{K C e^{ra}}{(r+\gamma)^2} \left[ e^{-(r+\gamma)(L+a)} \left( -(r+\gamma)(L+a) - 1 \right) - e^{-(r+\gamma)a} \left( -(r+\gamma)a - 1 \right) \right] + \frac{1-K}{r} \left( 1 - e^{-rL} \right) \right\}, \qquad (2)$$
where: a - age at onset of the disability (years); r - discount rate (usually 3%); K - age-weighting modulation constant; C - adjustment constant for age-weights; γ - age-weighting constant; L - duration of disability (years); DW - disability weight.
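As an illustration, the age-weighted and discounted YLD for a single case can be sketched in Python and checked against direct numerical integration. The sketch assumes the standard GBD form, in which the age weight C·x·exp(-γx) (blended with a uniform weight through K) is discounted continuously at rate r over the ages a to a + L. The default values K = 1, C = 0.1658 and γ = 0.04 are commonly quoted GBD conventions and are assumptions here, as are the function names; only r = 3% is stated in the text.

```python
import math

def yld_closed_form(dw, a, L, r=0.03, K=1.0, C=0.1658, gamma=0.04):
    """Age-weighted, discounted YLD for one case (closed form, requires r > 0).

    dw: disability weight, a: age at onset, L: duration of disability (years).
    The default K, C and gamma are assumed conventional values.
    """
    c = r + gamma
    age_term = (K * C * math.exp(r * a) / c**2) * (
        math.exp(-c * (L + a)) * (-c * (L + a) - 1.0)
        - math.exp(-c * a) * (-c * a - 1.0)
    )
    uniform_term = (1.0 - K) / r * (1.0 - math.exp(-r * L))
    return dw * (age_term + uniform_term)

def yld_numeric(dw, a, L, r=0.03, K=1.0, C=0.1658, gamma=0.04, steps=20000):
    """The same quantity by midpoint-rule integration over ages a .. a + L."""
    h = L / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        weight = K * C * x * math.exp(-gamma * x) + (1.0 - K)
        total += weight * math.exp(-r * (x - a)) * h
    return dw * total

# Hypothetical case: disability weight 0.2, onset at age 30, lasting 10 years.
closed = yld_closed_form(0.2, 30.0, 10.0)
numeric = yld_numeric(0.2, 30.0, 10.0)
```

Setting K = 0 switches age-weighting off, so that only discounting remains and the result reduces to DW·(1 - exp(-rL))/r.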

Years of Life Lost (YLL)
The Years of Life Lost due to premature mortality (YLL) correspond to the number of deaths multiplied by the standard life expectancy at the age at which death occurs. The basic formula for YLL, for a given cause, age and sex, is the following (Murray, 1996):
$$\mathrm{YLL} = N \times M, \qquad (3)$$
where: N - number of deaths; M - standard life expectancy at age of death (in years).
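The following data sources are utilized for calculating YLL: a death registration system (International Classification of Diseases ICD-9), epidemiological estimates, cause-of-death models, and life tables. Accounting for age-weighting and discounting, formula (3) for a single death takes the following form:
$$\mathrm{YLL} = \frac{K C e^{ra}}{(r+\gamma)^2} \left[ e^{-(r+\gamma)(M+a)} \left( -(r+\gamma)(M+a) - 1 \right) - e^{-(r+\gamma)a} \left( -(r+\gamma)a - 1 \right) \right] + \frac{1-K}{r} \left( 1 - e^{-rM} \right), \qquad (4)$$
where a is the age of death, M is the standard life expectancy at that age, and r, K, C and γ have the same meaning as in formula (2).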

Disability-Adjusted Life Years (DALYs)
The DALYs measure is becoming increasingly common in the field of public health and health impact assessment. It is defined as the sum of the Years Lived with Disability (YLD) and the Years of Life Lost to premature death (YLL). One DALY is thus equal to one lost year of healthy life, where the loss may be due to premature death or the occurrence of a disease or a disability. DALYs are calculated as follows (Murray, Lopez, Alan, 1994):
$$\mathrm{DALYs} = \mathrm{YLD} + \mathrm{YLL}, \qquad (5)$$
where YLD and YLL are given by formulas (1) and (3), respectively. The DALYs measure is also widely discussed in the literature (Murray, 1994; Berman, 1995; Desjarlais et al., 1995; Lozano et al., 1995; Martens et al., 1995; Barker, Green, 1996; Laurell, Arellano, 1996; Anand, Hanson, 1997; Devleesschauwer et al., 2014). The DALYs measure is helpful for identifying the main causes of the burden of disease and for allocating appropriate funds to deal with them. It also allows the effectiveness of undertaken actions to be assessed by monitoring changes in the burden of disease.
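The basic (undiscounted, unweighted) definitions can be illustrated with a few lines of Python. The function names and the input numbers below are hypothetical and serve only to show how the three measures relate to one another.

```python
def yld_basic(incident_cases, disability_weight, duration_years):
    """Basic YLD: incident cases x disability weight x average duration."""
    return incident_cases * disability_weight * duration_years

def yll_basic(deaths, std_life_expectancy):
    """Basic YLL: deaths x standard life expectancy at the age of death."""
    return deaths * std_life_expectancy

def dalys(yld, yll):
    """DALYs as the sum of the two components."""
    return yld + yll

# Hypothetical inputs for one cause in one population.
yld = yld_basic(incident_cases=1000, disability_weight=0.25, duration_years=4.0)
yll = yll_basic(deaths=50, std_life_expectancy=30.0)
total = dalys(yld, yll)
yll_share = yll / total  # contribution of premature mortality to DALYs
```

With these inputs the disability component is 1000 years, the mortality component 1500 years, and YLL accounts for 60% of the 2500 DALYs.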

GBD measures in selected European countries over the last 25 years
The DALYs, YLD, and YLL measures have been calculated every year since 1990 and are published by the Institute for Health Metrics and Evaluation (IHME) as a part of the Global Burden of Disease Study. These assessments reflect the changing causes of the burden of disease. Currently, they account for 300 diseases and injuries, 67 risk factors, and 1160 health consequences. The estimates are available for most countries from 21 regions of the world. Figure 1 shows the evolution of the DALYs measure for 26 selected European countries over the period 1990-2015. The values (lost years) are calculated per inhabitant. The series are characterized by a downward trend. Additionally, differences between the countries can be observed: a clearly separate group consists of Estonia, Hungary, Lithuania, and Latvia. These countries are characterized by significantly higher values of DALYs over the last 25 years. On the other hand, Iceland is the country with the lowest values of this measure. The spatial heterogeneity in the DALYs measure is shown in Figure 2. Besides the four countries mentioned, high values of DALYs are observed in other Central and Eastern European countries such as Poland and the Czech Republic. On the other hand, the measure is considerably lower in the old EU members. Figure 3 shows the evolution of the DALYs components. The presented series are unweighted averages for the 26 European countries. One can notice that the recent downward trend in the DALYs measure results from the decrease in YLL. On the other hand, YLD rises slightly but steadily during the whole analysed period. As a result, the contribution of YLL to DALYs decreases and is currently almost equal to the contribution of YLD. Figures 4 and 5 show the spatial heterogeneity of the DALYs components. It is worth noting that the cross-country differences in YLD are much smaller than those in YLL.
In the case of YLD, the range is equal to 0.02 years per inhabitant, whereas in the case of YLL it is about ten times higher. On the other hand, YLD is distributed more irregularly, as its values are not related to spatial location or the level of economic development.

Dynamic Spatial Error Fixed Effects Model (DSE_FEM)
We assume that the dependent variable y and k regressors x_j, j = 1, 2, …, k, are observed for N spatial units and T periods. Because the investigated objects (countries in our case) are selected in a nonrandom way, we employ spatial panel models with fixed effects. The spatial dependence is modelled with spatial autocorrelation in the error term, whereas time dynamics is represented by the time-lagged dependent variable. As a result, we consider a dynamic spatial panel data model with fixed effects and spatial autocorrelation of the error term. In the literature, it is also known as the Dynamic Spatial Error Fixed Effects Model (DSE_FEM) and has the following form (Elhorst, 2012):
$$y_{it} = \rho\, y_{i,t-1} + \sum_{j=1}^{k} \beta_j x_{ijt} + \mu_i + \varepsilon_{it}, \qquad (6)$$
where the error term has the form:
$$\varepsilon_{it} = \lambda \sum_{l=1}^{N} w_{il}\, \varepsilon_{lt} + v_{it}, \qquad (7)$$
for i, l = 1, …, N; t = 1, …, T; j = 1, …, k, where: y_it - the dependent variable; y_i,t-1 - the time-lagged dependent variable; ρ - the time autoregression parameter; x_ijt - a regressor; β_j - the parameter representing the impact of regressor j on the dependent variable; μ_i - the fixed effects parameter; ε_it, v_it - the error terms; w_il - an element of the spatial weight matrix; λ - the spatial autoregression parameter. The random variables v_it are independent, identically and normally distributed with expected value equal to 0.
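To make the data generating process concrete, the following Python sketch simulates a small panel from a model with a time-lagged dependent variable, fixed effects and a spatially autocorrelated error term, as described above. It is an illustration only: the function name, the toy weight matrix and all parameter values are hypothetical, and solving the error equation through the inverse of (I - λW) assumes that this matrix is invertible (which holds for a row-standardized W with |λ| < 1).

```python
import numpy as np

def simulate_dse_fem(W, X, beta, mu, rho, lam, y0, rng, sigma=1.0):
    """Simulate y_t = rho*y_{t-1} + X_t beta + mu + eps_t, eps_t = lam*W eps_t + v_t.

    X has shape (T, N, k), W has shape (N, N).
    Returns y with shape (T + 1, N), where y[0] = y0, and the innovations v (T, N).
    """
    T, N, _ = X.shape
    A_inv = np.linalg.inv(np.eye(N) - lam * W)  # eps_t = (I - lam W)^{-1} v_t
    y = np.empty((T + 1, N))
    y[0] = y0
    v = rng.normal(0.0, sigma, size=(T, N))
    for t in range(T):
        eps_t = A_inv @ v[t]
        y[t + 1] = rho * y[t] + X[t] @ beta + mu + eps_t
    return y, v

# Toy example: 3 units in a row (unit 1 borders units 0 and 2), 5 periods, 2 regressors.
rng = np.random.default_rng(0)
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
X = rng.normal(size=(5, 3, 2))
beta = np.array([0.5, -0.3])
mu = np.array([1.0, 2.0, 3.0])
rho, lam = 0.8, 0.4
y, v = simulate_dse_fem(W, X, beta, mu, rho, lam, y0=np.zeros(3), rng=rng)
```

Applying (I - λW) to the residuals y_t - ρy_{t-1} - X_tβ - μ recovers the innovations v_t, which is a convenient consistency check on the two equations.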

The idea of unified M-estimation of Dynamic Spatial Error Fixed Effects Model (DSE_FEM)
The main difficulty in using the ML or QML method to estimate DSE_FEM models with short panels is the modeling of the initial observations (the data generating process for the pre-sample period). The exact statistical properties of the estimators crucially depend on assumptions regarding the initial observation of the response vector (y_i0) (for the random effects model) or the initial differences (∆y_i0) (for the fixed effects model) (Dańska-Borsiak, 2011). A model for the initial differences involves the unknown process starting time. Moreover, its predictability typically requires that the time-varying regressors be trend or first-difference stationary.
When there are many time-varying regressors in the model, modelling the initial difference may introduce too many additional parameters, causing an efficiency decline (Yang, 2018). Yang (2018) proposed a unified, initial-condition-free approach to estimating the SDPD models with fixed effects. His method starts from the 'conditional' quasi-likelihood, with the initial differences being treated as if they were exogenous. Subsequently, corrections to the conditional quasi-score functions are made to give a set of unbiased estimating equations. Solving these unbiased estimating equations leads to estimators that are consistent and asymptotically normal. The corrections to the conditional quasi-scores are completely free from the specification of the distribution of the initial differences. The proposed estimator is simply referred to in this paper as the M-estimator, in the sense of Huber (1981) or van der Vaart (1998). Therefore, the estimator can be classified as a tool of robust statistics, a branch that has been developing rapidly since the 1980s. The Monte Carlo results (Yang, 2018) show that the proposed M-estimation method for dynamic spatial panel data models with fixed effects is not only valid when T is small, but also provides better estimators than the conditional quasi-likelihood approach when T is not small.

Empirical analysis

Variables, data and the empirical procedure
We analyze three endogenous variables: DALYs, YLD, and YLL. For all the variables we consider the same set of eight regressors that represent socio-economic factors. In the literature, the socio-economic determinants of health are not defined precisely and are usually labelled as social factors. According to the general definition of WHO: "the social determinants of health are the conditions in which people are born, grow, live, work and age. These circumstances are shaped by the distribution of money, power and resources at global, national and local levels" 1 . Therefore, in the study, the socio-economic determinants from the so-called "health areas" are selected: natural environment, lifestyle, macroeconomic environment, and healthcare 2 .

2
The factors other than socio-economic, like biology or genetics, are not taken into account in this study. This is due to the very nature of the dependent variables, which are calculated directly from data on deaths and diseases and therefore already encompass biological and genetic factors.

3
The variables are ascribed to the "health areas", somewhat arbitrarily, by the author.

Of course, the selection of the potential determinants is not exhaustive. However, the set of regressors is severely restricted by data availability for the studied period 4 .
The final set of analysed countries is determined by two criteria. First, out of the 26 countries for which the DALYs measure is depicted in Figure 1, we select those that are relatively homogeneous in terms of this measure. Second, we retain only those countries for which complete data for the eight regressors are available. As a result, the sample consists of 16 countries, mostly from the "old EU" 5 (without Luxembourg, but with Iceland and Norway): Austria, Belgium, Denmark, Finland, France, Greece, Spain, the Netherlands, Ireland, Germany, Portugal, Sweden, Great Britain, Italy, Iceland, and Norway. The final investigation period, the years 2003-2013, is the result of a compromise between the criteria for the selection of countries and the availability of data on the regressors.
The exogenous variables are taken from the OECD database. All the endogenous variables come from the database of the Institute for Health Metrics and Evaluation. The spatial matrix W is created using the common border criterion. It means that first the elements w ij are set equal to one if countries i and j share the same border and zero, otherwise. Then, the matrix W is row-standardized. Standard errors of the estimates are calculated using a heteroscedasticity robust procedure proposed by Yang (2018). All the computations are carried out in Matlab using the procedures written by Yang.
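The construction of the spatial weight matrix from the common border criterion, followed by row-standardization, can be sketched as follows. This is a Python illustration, not the Matlab procedures used in the study. The unit labels and the border list are hypothetical. One point the sketch makes explicit is that a unit without any neighbour in the sample (for example an island) ends up with an all-zero row; leaving such a row at zero is an assumption of this sketch and not necessarily the treatment applied in the study.

```python
import numpy as np

def border_weights(units, borders):
    """Row-standardized contiguity matrix from a list of bordering pairs.

    units: list of unit labels; borders: iterable of (unit_a, unit_b) pairs.
    Rows of units without neighbours are left as zeros (sketch assumption).
    """
    idx = {u: i for i, u in enumerate(units)}
    W = np.zeros((len(units), len(units)))
    for a, b in borders:
        W[idx[a], idx[b]] = 1.0  # contiguity is symmetric
        W[idx[b], idx[a]] = 1.0
    row_sums = W.sum(axis=1, keepdims=True)
    # Divide each row by its sum, skipping rows that sum to zero.
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

# Hypothetical layout: A-B and B-C share borders, D has no neighbour.
units = ["A", "B", "C", "D"]
W = border_weights(units, [("A", "B"), ("B", "C")])
```

After row-standardization each row with at least one neighbour sums to one, so the spatial lag of a variable is the average over a unit's neighbours; note that the standardized matrix is in general no longer symmetric.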

Empirical results
In the first stage of the analysis, we estimate the model (6)-(7) for the DALYs measure. The estimated coefficients are presented in the second column of Table 2. The next columns contain robust standard errors of the estimates, t statistics, and p-values. The results are slightly different from those presented by Orwat-Acedańska (2018) because of a minor difference in a sample composition 6 .
Five out of the eight exogenous variables turn out to be statistically significant at the 0.1 significance level. These are: ∆GDP (GDP growth rate), ALCOH (alcohol consumption), EDUC (years of education), H_CARE (healthcare spending), and SOCIAL (social spending). The dynamic and spatial autocorrelations are also significant, which supports the choice of the modeling tool.

4
For example, the lifestyle variables like smoking or consumption of fruits and vegetables, that are likely to affect the burden of disease, are not accounted for because of missing or incomplete data for most of the studied countries.

5
The term refers to the countries that form the European Union prior to the accession of the new members in 2004.

6
The cited paper includes the Czech Republic into the sample, whereas in this study this country is omitted.

To obtain a model in which all the variables are statistically significant, we successively eliminate the single variable with the highest p-value until no insignificant factors are left 7 . The results of this procedure are shown in Table 3. Now, all the explanatory variables are statistically significant. The DALYs measure for the analyzed population is correlated with both macroeconomic factors (GDP growth rate, social and healthcare spending in relation to GDP) and social factors (years of education), including lifestyle (alcohol consumption). Because the time autocorrelation parameter ρ is close to one, we can say that a slower decrease in the DALYs measure (its higher growth rates) is negatively associated with GDP dynamics, alcohol consumption, and social expenditures, and positively related to the education level and healthcare spending. The signs of some relations might seem surprising and counterintuitive at first sight, but they can be explained by considering the dependencies for the components of the DALYs measure. The significance of the dynamic and spatial autocorrelations supports our choice of the model.
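The elimination step described above (dropping the single regressor with the highest p-value and re-estimating until all remaining regressors are significant) can be sketched generically in Python. The estimation itself is abstracted behind a callable, here named fit_pvalues, which is a hypothetical placeholder for whatever routine returns p-values for a given set of regressors; the variable names and p-values in the example are made up, and the 0.1 threshold follows the significance level used in the text.

```python
def backward_eliminate(variables, fit_pvalues, alpha=0.1):
    """Drop the least significant regressor one at a time.

    variables: list of regressor names.
    fit_pvalues: callable taking a list of names and returning {name: p-value}
                 for the model re-estimated on exactly those regressors.
    Stops when every remaining regressor has a p-value below alpha.
    """
    kept = list(variables)
    while kept:
        pvals = fit_pvalues(kept)  # re-estimate on the current set
        worst = max(kept, key=lambda name: pvals[name])
        if pvals[worst] < alpha:
            break
        kept.remove(worst)
    return kept

# Stand-in for an estimator: fixed, made-up p-values that ignore the model set.
fake_p = {"x1": 0.01, "x2": 0.40, "x3": 0.08, "x4": 0.25}
selected = backward_eliminate(
    ["x1", "x2", "x3", "x4"],
    lambda names: {n: fake_p[n] for n in names},
)
```

In a real application the p-values change after every re-estimation, so the order in which variables are dropped cannot be read off the full model alone.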
In the second step, we analyse the two components of DALYs. In Table 4, the estimation results for the YLD indicator are presented.

Out of eight factors five are found to be statistically significant. These are exactly the same factors as in the case of the DALYs measure. After removing the insignificant variables, the remaining factors are shown in Table 5. And again, they are close to the estimates for DALYs. These results clearly indicate that the dependencies found between the DALYs measure and the studied factors come from the relationship between the YLD component and the factors.
The results in Table 5 show that higher dynamics of YLD is associated with lower GDP growth rates, alcohol consumption, and social expenditures, and with higher levels of education and healthcare spending. For example, the negative coefficient for alcohol consumption in Table 5 can be interpreted as follows: countries with higher alcohol consumption experience lower dynamics of YLD than countries with lower alcohol consumption. This effect may be explained by the higher rate of premature deaths in countries with higher alcohol consumption.
Finally, we conduct a similar analysis for YLL. The preliminary estimates for the full set of regressors are presented in Table 6, whereas the model with significant factors only is shown in Table 7. In the case of the YLL measure, the set of significant regressors is quite different from what is observed for the previous measures. A slower decrease (higher growth rates) in YLL is associated with lower levels of GDP, air pollution, and social spending, as well as with higher healthcare expenditures. For example, the negative sign of the air pollution coefficient can be interpreted as follows: YLL decreases more in the more polluted countries than in those with cleaner air. This relationship may exist because the former countries were able to reduce pollution more than the latter ones.

Conclusion
In this paper, we identified socio-economic factors that are associated with the changes in the disability-adjusted life years measure of burden of disease and its two components: YLD and YLL. We used the panel data for several old-EU countries in the period 2003-2013. The dependencies were investigated using the Dynamic Spatial Autoregressive Fixed Effects Models (DSAR-FEM) that account for correlation of observations across time and space. Their parameters were estimated using a novel M-estimation method developed recently by Yang (2018).
We showed that changes in DALYs are significantly related to alcohol consumption, healthcare spending, social spending, GDP growth rate and years of education. More importantly, exactly the same factors are significantly associated with variation in the YLD component. Social and healthcare spending as well as the GDP level and air pollution are also important for studying changes in YLL but the relationships seem to be weaker compared to the former measure.
The results show that the YLD component of DALYs is considerably more sensitive than YLL to changes in the socio-economic factors affecting the health level of the population in the studied countries. This implies that policies aimed at improving the population's health level are likely to affect primarily the years lived with disability and, to a much smaller extent, the life years lost due to premature death.