Contemporaneous correlation between the error terms μ and ν of Equations 8 and 9 poses several challenges for estimation. First, modeling production diversity and dietary diversity outcomes independently would ignore cross-equation error correlation, potentially reducing the precision of our estimates (Hill et al., 2008). To test for contemporaneous correlation, we estimate Equations 8 and 9 independently and calculate the correlation between the residuals from the two equations. In our estimation we use two different indicators of household dietary diversity: Food Consumption Score (FCS) and Household Dietary Diversity Score (HDDS), as described in Section 3.4. Using OLS to independently estimate the two equations with FCS as the outcome variable, we find a significant positive correlation between the residuals (r = 0.194, p = 0.001), which suggests potential efficiency gains from estimating the two equations jointly. We therefore use Zellner's seemingly unrelated regression (SUR) technique to jointly estimate the correlates of production diversity and FCS, the first indicator of dietary diversity (Zellner, 1962). However, with HDDS as the outcome variable there is no significant correlation between the residuals from Equations 8 and 9 fitted with OLS. The same is true if we instead fit the HDDS equation with a Poisson model, which may be more appropriate for a count outcome such as HDDS (Snapp and Fisher, 2015; Hirvonen and Hoddinott, forthcoming). As a result, we expect no loss of efficiency from independent estimation of the correlates of production diversity and HDDS, our second dietary diversity indicator.
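The residual-correlation check described above can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the function names, the single regressor per equation, the simulated coefficients, and the use of the Breusch-Pagan LM statistic (n times the squared residual correlation, compared with a chi-square(1) critical value) as the significance test. The text reports a correlation and a p-value but does not state which test was used.

```python
import math
import random


def ols_residuals(x, y):
    """Residuals from a simple OLS regression of y on an intercept and x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def residual_correlation_test(n=2000, rho=0.2, seed=0):
    """Simulate two equations whose errors have correlation rho, fit each
    by OLS, and return (residual correlation, Breusch-Pagan LM statistic)."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical regressor, equation 1
    x2 = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical regressor, equation 2
    z1 = [rng.gauss(0, 1) for _ in range(n)]
    z2 = [rng.gauss(0, 1) for _ in range(n)]
    mu = z1
    nu = [rho * a + math.sqrt(1 - rho ** 2) * b for a, b in zip(z1, z2)]
    prod_div = [1.0 + 0.5 * x + e for x, e in zip(x1, mu)]  # simulated outcome 1
    diet_div = [2.0 + 0.3 * x + e for x, e in zip(x2, nu)]  # simulated outcome 2
    r = correlation(ols_residuals(x1, prod_div), ols_residuals(x2, diet_div))
    lm = n * r ** 2  # approximately chi-square(1) when the errors are uncorrelated
    return r, lm


if __name__ == "__main__":
    r, lm = residual_correlation_test()
    # 3.841 is the 5% critical value of the chi-square(1) distribution.
    print(f"residual correlation = {r:.3f}, LM = {lm:.2f}, reject at 5%: {lm > 3.841}")
```

A rejection in a check of this kind is what motivates joint estimation. SUR can improve on equation-by-equation OLS only when the errors are correlated and the two equations do not share an identical set of regressors.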
The second challenge arises when we consider how to estimate effects of production diversity or farm profitability on dietary diversity. One straightforward approach would be to include production diversity and profitability as explanatory variables in Equation 9 and estimate
coefficients using OLS. In fact, this approach is common in recent studies addressing similar research questions (Jones et al., 2014b; Sibhatu et al., 2015; Snapp and Fisher, 2015). However, we know from Equation 8 that production diversity (PD) varies with μ, which is correlated with ν. Given our observation of contemporaneous cross-equation error correlation, production diversity would therefore be an endogenous explanatory variable. In other words, unobservable household preferences for a diverse or nutritious diet are likely to affect crop and livestock management decisions, with de facto effects on production diversity. Farm profitability is also an outcome of farm management decisions, and may therefore be similarly correlated with unobservable household dietary preferences. As a result, this empirical strategy cannot unequivocally identify specific causal effects of production diversity or farm profitability on dietary diversity. However, significant coefficient estimates on production diversity and farm profitability will provide suggestive evidence in support of these mechanisms as pathways linking agricultural production and dietary diversity.
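The endogeneity argument can be made explicit with a short derivation. The notation below is a simplification assumed for illustration and is not taken from Equations 8 and 9: household subscripts are suppressed, the exogenous controls are treated as partialled out of both equations, DD denotes the dietary diversity outcome, z stands for the exogenous determinants of production diversity, and z is assumed to be uncorrelated with ν. The block assumes the amsmath package.

```latex
% Simplified notation (an assumption for illustration): subscripts suppressed,
% exogenous controls partialled out, and Cov(z, \nu) = 0.
\begin{align}
DD &= \beta\, PD + \nu, \qquad PD = \pi' z + \mu, \\
\operatorname{plim}\, \hat{\beta}_{OLS}
  &= \frac{\operatorname{Cov}(PD, DD)}{\operatorname{Var}(PD)}
   = \beta + \frac{\operatorname{Cov}(PD, \nu)}{\operatorname{Var}(PD)}
   = \beta + \frac{\operatorname{Cov}(\mu, \nu)}{\operatorname{Var}(PD)} .
\end{align}
```

Under these assumptions, OLS is consistent for β only when μ and ν are uncorrelated. Otherwise the asymptotic bias takes the sign of the covariance between the two error terms.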
Alternatively, instrumental variable (IV) estimation can be used to identify the causal effects of an endogenous explanatory variable, producing consistent coefficient estimates if viable instruments can be found. Two recent studies apply IV approaches to address this problem of endogeneity. Dillon et al. (2015) benefit from a panel dataset that includes observations from post-planting and post-harvest periods for a nationally representative sample of approximately 5,000 agricultural households in Nigeria. They use data on temperature, rainfall and semi-fixed agricultural capital from the planting season to instrument for production diversity and agricultural revenue. They find statistically significant, albeit small, impacts of production diversity and agricultural revenue on post-harvest dietary diversity. Notably, the coefficients on production diversity and agricultural revenue display a downward bias in preliminary OLS models that fail to address endogeneity. The authors are not confident in their identification of production diversity effects on dietary diversity, as the excluded instruments do not pass the Sargan-Basmann test of overidentifying restrictions. Furthermore, their selection of total
agricultural revenue, rather than net farm profit, as an indicator of agricultural income may be misleading as it does not account for production costs.
Hirvonen and Hoddinott (forthcoming) use a set of village-level instruments, including elevation, temperature and slope, plus an interaction term between elevation and temperature, to instrument for household-level production diversity in rural Ethiopia. The authors then estimate effects of farm production diversity on an individual dietary diversity score (IDDS) for children aged 6 to 59 months. Their results show significant positive effects of production diversity on IDDS, ceteris paribus. As in Dillon et al. (2015), the coefficient on production diversity is biased downward in preliminary OLS and Poisson models. Standard diagnostic tests indicate that this IV model is well identified.
This recent use of IV estimation shows promise for reducing bias in the identification of causal relationships between agricultural production and dietary diversity at the household level. However, our cross-sectional household survey data come from a relatively small region in the Peruvian highlands, and the dataset does not contain household-level climate data or other variables that would make for especially compelling instruments. To address concerns about biased estimates due to violations of the exogeneity assumption, Appendix 5 presents IV estimates of the impact of production diversity on dietary diversity, using farm size and farm fragmentation as instruments for production diversity. IV diagnostic tests suggest that these instruments are relevant and that the model is overidentified. However, the ability of our instruments to meet the exclusion restriction requires strong assumptions about missing markets for land rental and purchase. Although historical evidence supports this assumption, it cannot be tested in our sample.
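The mechanics of an IV estimate with two instruments for one endogenous regressor can be sketched with two-stage least squares (2SLS) on simulated data. This is an illustration under stated assumptions, not the estimation reported in Appendix 5: the variable names (farm_size, fragmentation, taste), the coefficients, and the sign of the simulated error correlation are all hypothetical. The sign is chosen so that OLS is biased downward, mirroring the pattern reported in the cited studies; it is a simulation choice, not evidence about the survey data.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(columns, y):
    """OLS coefficients of y on an intercept plus the given regressor columns."""
    cols = [[1.0] * len(y)] + columns
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


def two_sls(endog, instruments, y):
    """2SLS slope on a single endogenous regressor.

    Stage 1 regresses the endogenous regressor on the instruments; stage 2
    regresses y on the stage-1 fitted values. The point estimate is valid,
    but standard errors from a naive second-stage OLS would not be.
    """
    first = ols(instruments, endog)
    fitted = [
        first[0] + sum(coef * z[i] for coef, z in zip(first[1:], instruments))
        for i in range(len(y))
    ]
    return ols([fitted], y)[1]


def simulate(n=5000, beta=0.5, seed=1):
    """Return (OLS slope, 2SLS slope) for simulated data with true slope beta."""
    rng = random.Random(seed)
    farm_size = [rng.gauss(0, 1) for _ in range(n)]      # hypothetical instrument
    fragmentation = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical instrument
    taste = [rng.gauss(0, 1) for _ in range(n)]          # unobserved by the analyst
    prod_div = [
        1.0 + 0.8 * s + 0.5 * f - 0.6 * t + rng.gauss(0, 1)
        for s, f, t in zip(farm_size, fragmentation, taste)
    ]
    diet_div = [2.0 + beta * p + t + rng.gauss(0, 1) for p, t in zip(prod_div, taste)]
    beta_ols = ols([prod_div], diet_div)[1]
    beta_iv = two_sls(prod_div, [farm_size, fragmentation], diet_div)
    return beta_ols, beta_iv


if __name__ == "__main__":
    beta_ols, beta_iv = simulate()
    print(f"true slope = 0.5, OLS = {beta_ols:.3f}, 2SLS = {beta_iv:.3f}")
```

In this simulation the instruments are valid by construction, so 2SLS recovers the true slope while OLS does not. With real data the exclusion restriction is an assumption, which is why instrument validity rather than computation is the binding constraint.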
The IV approach is not our preferred specification, given its reduced efficiency and our uncertainty about its ability to reduce bias. Instead, we rely on OLS to evaluate effects of production diversity and farm profit on household dietary diversity. Results from Dillon et al. (2015) and Hirvonen and Hoddinott (forthcoming), in addition to results from our own IV
estimation in Appendix 5, suggest that OLS coefficient estimates will be biased downward and will therefore provide conservative estimates of the associated effects. Additional robustness checks are performed using alternative specifications listed in Appendix 8 (results not shown).
3.4 STUDY SITE AND DATA