VI. RESULTS AND DISCUSSION
6.4. Determination of the enzymatic activity generated in the different materials of
Variation in product characteristics – such as quality or brand name – across locations may be reflected in differences in the final retail prices of products sold in different locations. If this is the case, then the observed price gap for a specific product across two locations will not only be driven by the cost of trade between the two locations but also by actual differences in the characteristics of the product sold in each location.
Product heterogeneity across locations may itself arise from trade costs. According to the theorem first proposed by Alchian and Allen in 1964, a fixed per-unit transport cost that applies equally to substitutable high-quality and low-quality variants of a product will generate differences in the quality of products available at the source and at the destination. The per-unit transport cost reduces the relative price of the high-quality variant at the destination, which produces a substitution effect in favour of consuming the high-quality product there. The net result is that only the high-quality products are shipped out of the source location,45 resulting in variation in product quality between the source and consumption locations.
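The relative-price mechanism can be written out as a short worked inequality. The notation is illustrative and not taken from the source: $p_H > p_L > 0$ are assumed source prices of the high- and low-quality variants, and $t > 0$ is the common per-unit transport cost.

```latex
\[
\frac{p_H + t}{p_L + t} - \frac{p_H}{p_L}
  = \frac{p_L (p_H + t) - p_H (p_L + t)}{p_L (p_L + t)}
  = \frac{t \, (p_L - p_H)}{p_L (p_L + t)} < 0
  \qquad \text{for } p_H > p_L > 0,\; t > 0 .
\]
```

For instance, with hypothetical values $p_H = 2$, $p_L = 1$ and $t = 1$, the relative price of the high-quality variant falls from 2 at the source to 1.5 at the destination.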
By not including an appropriate measure of variation in product quality or characteristics across locations, the price-based regressions in the literature may attribute the portion of the price gap that is actually generated by unobserved product heterogeneity to distance or border-related trade costs, resulting in omitted variable bias in the estimated distance and border coefficients. For example, quality differences within narrowly defined products are likely to rise with distance, while quality is also positively correlated with price gaps. Consequently, the estimated distance coefficient may pick up the quality effect in addition to the effect of the trade cost.
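The direction of this bias can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the estimation used in this chapter: all parameter values are hypothetical, the data are simulated, and a simple OLS slope stands in for the quantile regressions. When the quality gap rises with distance and is left out of the regression, the estimated distance slope approaches the true trade-cost effect plus the quality effect multiplied by the slope of quality on distance.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


random.seed(0)

# All three parameters are hypothetical values chosen for illustration.
TRUE_DISTANCE_EFFECT = 0.10   # trade-cost effect of log distance on the price gap
QUALITY_EFFECT = 0.50         # effect of the quality gap on the price gap
QUALITY_ON_DISTANCE = 0.20    # how fast the quality gap rises with log distance

log_distance, price_gap = [], []
for _ in range(20000):
    d = random.uniform(2.0, 7.0)
    quality_gap = QUALITY_ON_DISTANCE * d + random.gauss(0.0, 0.3)
    gap = (TRUE_DISTANCE_EFFECT * d
           + QUALITY_EFFECT * quality_gap
           + random.gauss(0.0, 0.2))
    log_distance.append(d)
    price_gap.append(gap)

# Regression that omits the quality gap: the slope absorbs the quality channel.
estimated = ols_slope(log_distance, price_gap)
expected_short_slope = TRUE_DISTANCE_EFFECT + QUALITY_EFFECT * QUALITY_ON_DISTANCE
print(round(estimated, 3), expected_short_slope)
```

In this set-up the estimated slope is roughly twice the true distance effect, purely because the omitted quality gap is correlated with distance.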
A comparison of perfectly homogenous products across locations is required to fully overcome these problems. Recognising this, most studies use disaggregated prices for a set of very narrowly defined goods and services that is common across all locations. Within this group, some studies use barcode-level data in order to precisely identify identical products sold in different stores or locations (see, for instance, Broda & Weinstein, 2008; Gopinath et al., 2011; Atkin & Donaldson, 2014). However, unobserved heterogeneity may be present even in comparisons of products with identical barcodes if, for example, products for sale at a particular location come with additional bundled services that are not available at another sale location.
45 The classic example of this is the so-called ‘Washington apples effect’, which results in only the good quality apples being shipped out of Washington.
Testing for bias stemming from the presence of unobserved within-product heterogeneity across locations
While the 24 narrowly defined products used in this chapter are highly similar across the countries included in the sample, they are not perfectly homogenous. In many instances, a lack of detail on brand names and descriptions of product-specific characteristics in the raw country price data means that there may be unobserved differences in the quality of individual products across countries (or even districts within countries). These differences may generate deviations in product prices between districts in different countries for ostensibly common products over and above the differences in prices that are driven by the cost of trade between the two districts.
As an initial test of the potential influence of within-product heterogeneity across locations, the quantile regression estimates obtained from the full sample of 24 nearly homogenous products (presented initially in Table 8) are compared to estimates from separate quantile regressions run for two sub-sets of the full sample: the first with only highly homogenous products; and the second with less homogenous products (see Table 10).46 The highly homogenous group consists of agricultural products (fresh fruits and vegetables) and paraffin, all of which are unlikely to differ much across districts, and are generally highly substitutable. In comparison, there is likely to be greater scope for within-product variety across districts in the case of the less homogenous products. Hence, it is anticipated that the coefficients on the distance and border variables will be smaller when estimated using the sub-set of highly homogenous products.
As expected, the border coefficients are smaller when estimated using the sub-set of highly homogenous products compared to those obtained from the full sample and the sub-set of less homogenous products (except in the case of the regressions using the mean price gap). Similarly, the values for the distance-equivalent of the border effect are largest when the sub-set of less homogenous products is used.
46 These products are distinguished using Rauch’s (1999) highly disaggregated product classification scheme (see Table A.1 in Appendix A).
Table 10: Comparison of quantile regression estimates – full sample versus highly homogenous products
          (0.00278)   (0.00490)   (0.00498)   (0.00406)   (0.00627)   (0.00642)   (0.00296)   (0.00484)   (0.00487)
border    0.195***    0.224***    0.227***    0.206***    0.201***    0.207***    0.188***    0.237***    0.239***
Notes: The dependent variable is the 95th percentile or the maximum of the absolute value of the log price difference between districts j and k in year t for bin n (using 500 distance bins). All regressions are estimated with a constant and year and product fixed effects. Robust standard errors (reported in parentheses) are clustered by distance bin. Significance at the 10 percent, 5 percent and 1 percent levels is denoted by *, ** and *** respectively.
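The construction of the dependent variable described in these notes can be sketched as follows. This is only an illustration under assumptions: the source does not say how the 500 distance bins are formed, so fixed-width bins (the hypothetical `bin_width_km` parameter) are used here, the percentile uses the nearest-rank convention, and the sample prices are invented.

```python
import math
from collections import defaultdict


def percentile(values, q):
    """Nearest-rank percentile (one of several common conventions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


def binned_price_gaps(pairs, bin_width_km):
    """Summarise absolute log price gaps by distance bin.

    pairs: iterable of (distance_km, price_j, price_k) for one product and year.
    Returns {bin index: (mean, 95th percentile, maximum)} of |ln p_j - ln p_k|.
    """
    bins = defaultdict(list)
    for distance_km, price_j, price_k in pairs:
        gap = abs(math.log(price_j) - math.log(price_k))
        bins[int(distance_km // bin_width_km)].append(gap)
    return {
        b: (sum(g) / len(g), percentile(g, 95), max(g))
        for b, g in sorted(bins.items())
    }


# Hypothetical district-pair prices in a common currency.
sample = [
    (12.0, 1.00, 1.10),
    (35.0, 2.00, 2.00),
    (140.0, 1.00, 1.50),
    (160.0, 3.00, 2.00),
]
summary = binned_price_gaps(sample, bin_width_km=100.0)
```

Each bin then contributes one observation per product and year to a regression on log distance and the border dummy, with the choice of mean, 95th percentile or maximum defining the specification.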
However, contrary to expectations, the distance coefficients are larger for the highly homogenous products. The larger distance estimates may be due to the dominance of fresh fruit and vegetables in the highly homogenous products sub-sample (seven of the nine homogenous products are perishable fruits or vegetables). Per-unit transportation costs are typically higher for perishable foods, which would likely be reflected in a higher average distance coefficient for a sample dominated by these products. Indeed, the empirical evidence in Table B.4 in Appendix B – which reports the distance and border estimates obtained from quantile regressions using the highly homogenous products sub-set but with the fruit and vegetables excluded – suggests that this is the case.47 The estimated distance coefficients are notably smaller when the fruit and vegetable products are excluded from the highly homogenous products sample.
These results provide some preliminary evidence to suggest that unobserved within-product heterogeneity across districts may be an important factor within the sample of 24 nearly homogenous products.
47 The homogenous products sample used in these regressions now only includes two products: rice and paraffin.
Nearly homogenous versus perfectly homogenous products
In order to provide a more rigorous examination of the extent of the potential bias in the estimates arising due to the presence of within-product heterogeneity across districts, the same quantile regression methodology is applied to an alternative dataset of perfectly homogenous products that share exactly the same brand and unit across districts.48 Within this alternative dataset, only Botswana and Zambia share more than two or three identical products – prices for eight perfectly homogenous products are observed in both countries. The analysis in this section is thus restricted to districts in Botswana and Zambia. The perfectly homogenous products included in the dataset for these two countries are: baking powder (Royal, 100g); biscuits (Eet-Sum-Mor, 200g); coffee (Ricoffy, 250g); floor polish (Cobra (white), 400ml); margarine (Butter Cup, 250g); pilchards (Lucky Star, 155g); petroleum jelly (Vaseline Blue Seal, 50g); and shoe polish (Kiwi, 50ml).
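The price preparation steps listed in footnote 48 (prices net of VAT, conversion to US$, and collapsing monthly prices to annual averages) could be sketched as below. The function name, the input layout, the VAT rate and the exchange rates are all hypothetical, and the outlier-removal step is omitted because its rule is not given in this section.

```python
from collections import defaultdict


def annual_net_usd_prices(monthly_rows, vat_rate, local_per_usd):
    """Average annual US$ prices net of VAT per district-product-year.

    monthly_rows: iterable of (district, product, year, month, local_price_incl_vat).
    vat_rate: VAT as a fraction, e.g. 0.12 (hypothetical value).
    local_per_usd: {(year, month): units of local currency per US$} (hypothetical input).
    """
    buckets = defaultdict(list)
    for district, product, year, month, price in monthly_rows:
        net_local = price / (1.0 + vat_rate)             # strip VAT
        usd = net_local / local_per_usd[(year, month)]   # convert to US$
        buckets[(district, product, year)].append(usd)
    # Collapse the monthly observations to a simple annual average.
    return {key: sum(v) / len(v) for key, v in buckets.items()}


# Invented prices and exchange rates, for illustration only.
rows = [
    ("District A", "coffee", 2010, 1, 22.4),
    ("District A", "coffee", 2010, 2, 33.6),
]
rates = {(2010, 1): 7.0, (2010, 2): 7.0}
annual = annual_net_usd_prices(rows, vat_rate=0.12, local_per_usd=rates)
```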
The original sample of 24 nearly homogenous products is now also restricted to price gap observations drawn from Botswana and Zambia. This allows for a comparison of the Botswana-Zambia border coefficients obtained from estimations using the dataset of perfectly homogenous products with those estimated from regressions based on the original sample of nearly homogenous products. In this way, the sensitivity of the estimated border effect to possible unobserved within-product heterogeneity across countries (which will not be present for the perfectly homogenous products) is tested.
The estimation results are compared in Table 11. In both cases, the quantile regressions are estimated using the preferred specification that controls for the product sample selection bias. As expected, the coefficients on both the log distance and Botswana-Zambia border dummy variables are consistently larger when the sample of 24 nearly homogenous products is used. The deviation in absolute prices generated by a distance of 100km between districts, whether within Botswana or Zambia or between the two countries, falls from 45.6% to 37.3% when the relationship is estimated using perfectly homogenous products. Similarly, the impact of crossing the Botswana-Zambia border on the price gap falls from 19.1% to 12.8%. The upward bias on the estimated coefficients is larger for the border estimate in all specifications. As a result, the distance-equivalent of the border effect is uniformly smaller when calculated from the estimates obtained using the sample of perfectly homogenous products.
48 As in the case of the nearly homogenous products dataset, the prices of the perfectly homogenous products are recalculated net of VAT and converted into common US$ prices. The same procedure for removing outliers (explained earlier) is implemented, resulting in the deletion of prices for 0.7% of the whole perfectly homogenous products dataset. Thereafter, the monthly common currency prices are collapsed to obtain average annual prices for each district and product combination.
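The figures quoted above can be tied together with a short calculation. The sketch assumes a formula that this section does not state: that the distance-equivalent of the border effect is exp(border coefficient / log-distance coefficient), a common convention in the border-effect literature. It also assumes that the 100km effect equals the log-distance coefficient multiplied by ln(100). The function names are illustrative.

```python
import math


def implied_distance_coefficient(effect_at_km, km):
    """Back out the log-distance coefficient from a reported effect at `km` kilometres."""
    return effect_at_km / math.log(km)


def distance_equivalent(border_coef, distance_coef):
    """Distance generating the same price gap as the border (assumed formula)."""
    return math.exp(border_coef / distance_coef)


# Inputs quoted in the text: 100km effects of 45.6% and 37.3%,
# border effects of 19.1% and 12.8%.
nearly_dist = implied_distance_coefficient(0.456, 100)
perfect_dist = implied_distance_coefficient(0.373, 100)

nearly_equiv = distance_equivalent(0.191, nearly_dist)
perfect_equiv = distance_equivalent(0.128, perfect_dist)
print(round(nearly_equiv, 1), round(perfect_equiv, 1))
```

Under these assumptions the results fall within about 0.1 of the distance-equivalents of 6.9 and 4.8 reported in the corresponding columns of Table 11, with the remaining difference attributable to rounding in the quoted inputs.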
The results indicate that some of the dispersion in relative prices across districts that would have been attributed to distance-related trade costs and the border effect may actually be due to unobserved differences in product quality or within-product variety between districts in the SADC sample. This product heterogeneity generates an omitted variable bias that is not accounted for in the main regressions.
Table 11: Comparison of quantile regression estimates – nearly homogenous versus perfectly homogenous products for Botswana-Zambia district pairs only
                                          Full sample of 24 nearly homogenous products    Perfectly homogenous products
                                          (0.00252)       (0.00636)       (0.00654)       (0.00217)       (0.00538)       (0.00570)
border                                    0.138***        0.174***        0.191***        0.0916***       0.109***        0.128***
                                          (0.00472)       (0.0214)        (0.0226)        (0.00411)       (0.0129)        (0.0141)
Observations                              48,703          48,703          48,703          14,978          14,978          14,978
R-squared                                 0.297           0.390           0.397           0.265           0.344           0.367
Distance-equivalent of the border effect  761.0           6.0             6.9             104.6           4.1             4.8
Notes: The dependent variable is the mean, 95th percentile or maximum of the absolute value of the log price difference between districts j and k in year t for bin n (using 500 distance bins). All regressions are estimated with a constant and year and product fixed effects. Robust standard errors (reported in parentheses) are clustered by distance bin. Significance at the 10 percent, 5 percent and 1 percent levels is denoted by *, ** and *** respectively.