Most models will contain some sort of error. This is inescapable, as it is impossible to include all the relevant variables in a model. Nor is a large number of controls a feasible approach: the more correlated the main explanatory variable is with irrelevant controls, the less efficient is the estimate of the main causal effect (King, Keohane and Verba 1994, pg. 182).
I base my choice of controls on two considerations. These are simply stated in the words of Schrodt (2010, pg. 2): “(...) models must always steer between the rock of collinearity and the hard place of omitted variable bias (...)”.30 I follow the same approach by avoiding controls that either have a minimal effect on non-state conflicts or are collinear with the explanatory variables put forth. Further, I seek to avoid omitted variable bias by including controls that have proved robust in former studies of conflict. The controls I include are population, GDP per capita, relevant groups and peace years.
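The efficiency cost of collinearity can be made concrete with the variance inflation factor, VIF = 1 / (1 - R²), where R² comes from regressing one explanatory variable on the others. With only two regressors, R² is simply their squared correlation. The following is a minimal sketch under that two-regressor assumption; the function names and the correlation values are illustrative and are not taken from the data used in this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def variance_inflation(r):
    """VIF of a regressor whose single co-regressor correlates with it at r."""
    return 1.0 / (1.0 - r ** 2)


# Illustrative correlations only (not estimates from this study):
# the variance of the coefficient grows as the correlation rises.
for r in (0.0, 0.5, 0.9):
    print(r, round(variance_inflation(r), 2))
```

A correlation of 0.5 between the explanatory variable and a control inflates the coefficient variance by about a third, whereas a correlation of 0.9 inflates it more than fivefold.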
Population
Within the conflict literature, two explanatory factors show consistency in nearly all studies of conflict. The first is population size and the second is income level. Countries with large populations are often associated with all sorts of collective violence (see e.g. Collier and Hoeffler 2004; Hegre and Sambanis 2006; Raleigh and Hegre 2009). I use data from the Penn World Table to construct the variable population. The original values are log-transformed in order to obtain a more normally distributed variable.
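The effect of the log transformation can be sketched by comparing the skewness of a variable before and after transforming it. The example below assumes the natural logarithm (the base is not essential to the argument) and uses made-up population figures, not values from the Penn World Table; the skewness helper is a hypothetical convenience function.

```python
import math


def skewness(values):
    """Skewness as the third standardized moment (population formula)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical population sizes in thousands, strongly right-skewed.
population = [500, 1_200, 4_800, 9_500, 38_000, 140_000, 1_300_000]

# Natural log of each value (assumed base; any base gives the same shape).
log_population = [math.log(p) for p in population]

print("raw skewness:", skewness(population))
print("log skewness:", skewness(log_population))
```

The raw series is dominated by its largest value, while the logged series is far more symmetric, which is the property the transformation is meant to deliver.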
Income level
Within the broader conflict literature, perhaps the most robust finding is the one connecting poverty to collective violence. Nearly all quantitative studies find a strong relationship between low income levels and collective violence, after controlling for other explanatory variables. While the causal path explaining this relationship is contested, GDP per capita remains the most robust predictor of conflict (Hegre and Sambanis 2006, pg. 531). There is one problem with including GDP per capita as a control variable, namely its correlation with the explanatory variables included. This issue is largely due to the interwoven relationship between income, political institutions and state capacity. Within the literature, there is evidence linking all three together. Several decades ago, Lipset (1959) argued that high levels of income lead to a higher probability that states democratize. Since then, the relationship has been confirmed in a number of studies (see e.g. Przeworski et al. 2000; Welzel and Inglehart 2006).
My choice to include income level as a control variable rests on weighing two considerations. While it introduces collinearity with one of the main explanatory variables, excluding it could lead to biased results. It is omitted variable bias I fear the most, which leads me to include GDP per capita as one of the control variables. The variable is log-transformed in order to create a distribution closer to the normal distribution. It is taken from the Penn World Table (Heston and Aten 2012).
Number of Groups
When studying internal divisions within rebel movements, scholars find that multiple factions increase the risk of in-fighting (Bakke, Cunningham and Seymour 2012; Cunningham, Bakke and Seymour 2012; Cunningham 2013). This means that when the number of groups with divergent interests increases, so does the fighting between them. While these studies focus on fighting between co-ethnics and rebel movements, they suggest that the number of groups is of importance. I have therefore chosen to control for the number of groups within states. A high number of groups also means an increase in the number of potential dyads fighting each other (Rudolfsen 2013). On this basis, I include a variable, relevant groups, counting the number of groups within states. The data on relevant groups are taken from GROWup (2014), which is an updated version of the Ethnic Power Relations (EPR) data (Cederman, Min and Wimmer 2009). As defined by Cederman, Gleditsch and Weidmann (2010, pg. 99), politically relevant groups are “(...) all ethnic groups for which at least one political organization exists that promotes an ethnically oriented agenda in the national political arena, or ethnic groups that are subject to political discrimination.”
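The link between the number of groups and the number of potential dyads follows from simple counting: n groups can form n(n - 1)/2 unordered pairs. The short sketch below illustrates this arithmetic; the function name is hypothetical and the group counts are arbitrary examples rather than values from the GROWup data.

```python
def potential_dyads(n_groups):
    """Number of unordered pairs among n_groups groups: n(n - 1) / 2."""
    if n_groups < 0:
        raise ValueError("n_groups must be non-negative")
    return n_groups * (n_groups - 1) // 2


# Arbitrary example counts: dyads grow much faster than groups.
for n in (2, 5, 10):
    print(n, potential_dyads(n))
```

Two groups form a single dyad, five groups form ten, and ten groups form forty-five, so the number of potential conflict pairs rises roughly with the square of the number of groups.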
Peace Years
When dealing with cross-sectional time series data, the assumption of independence between observations is likely to be violated. Perhaps the most prominent example is how the level of income in one particular year hardly ever differs much from the value in the preceding or following year. In my case, the occurrence of non-state conflicts in one year is likely to influence the occurrence in the following years. Beck and Tucker (1998) and Carter and Signorino (2010) argue that when scholars examine cross-sectional time series data with a binary dependent variable, they need to account for the potential of temporal dependence among observations. Failing to do so could lead to biased as well as imprecise results.
I use the method suggested by Carter and Signorino (2010) to deal with the problem of temporal dependence between observations. I model a cubic polynomial based on the time since the last occurrence of non-state conflict, using the variables peace years, peace years² and peace years³.31 Together, these three terms capture the potential time dependence between the occurrences of non-state conflicts.
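The construction of the three peace-years terms can be sketched as follows. This is a minimal illustration for a single country, assuming that the counter starts at zero in the first observed year and is reset to zero in the year following a conflict; the counting convention, the function names and the example series are assumptions made for illustration and are not taken from the data set.

```python
def peace_years(conflict):
    """Years since the last conflict, measured at the start of each year.

    conflict: chronological list of 0/1 indicators for one country.
    Assumed convention: the counter starts at 0 and is reset to 0
    in the year immediately after a conflict year.
    """
    counts = []
    since_last = 0
    for occurred in conflict:
        counts.append(since_last)
        since_last = 0 if occurred else since_last + 1
    return counts


def cubic_terms(t):
    """Linear, squared and cubed duration terms for the polynomial."""
    return (t, t ** 2, t ** 3)


# Made-up conflict history for one country over eight years.
conflict = [0, 0, 1, 0, 0, 0, 1, 0]
years = peace_years(conflict)
terms = [cubic_terms(t) for t in years]

for outcome, row in zip(conflict, terms):
    print(outcome, row)
```

Each row of terms would enter the regression alongside the other explanatory variables, so that the estimated probability of conflict can vary smoothly with the time elapsed since the previous conflict.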