In this section we consider strategies for each of the sources of outliers described above.
4.1.3.1 Deterministic causes
Unfortunately we have no way of identifying which, if any, data may be subject to such errors. So we discount the possibility of simply rejecting observations on the grounds of unlikelihood and impracticability, and rely on methods robust to the presence of outliers.
4.1.3.2 Inherent variability
As will be seen in Section 4.2.6, the methods that we will adopt will automatically indicate whether an alternative should be considered (though not necessarily what that alternative should be).
One possibility is the skewed-t distribution introduced by Jones and Faddy (2003), in which the weight of each tail is controlled by a distinct parameter; this could be fitted by maximum-likelihood methods. But our only motive for considering this is the possibility of descriptive accuracy; we do not have any theoretical reason to suppose it valid.38
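As an illustration of what fitting such a model would involve, the following is a minimal sketch of the Jones and Faddy log-density and the corresponding log-likelihood, assuming the standard two-parameter form with tail parameters a and b plus an added location and scale. The function names and the sample values are hypothetical and are not taken from our data; any numerical optimiser could then be used to maximise the log-likelihood.

```python
import math

def jf_skew_t_logpdf(t, a, b):
    """Log-density of the Jones-Faddy skew-t at t, for a > 0 and b > 0."""
    c = a + b
    s = math.sqrt(c + t * t)
    # We need log(1 + t/s) and log(1 - t/s). Because (s - t)(s + t) = c,
    # the factor that approaches zero in each tail can be rewritten as
    # c / (s * (s + |t|)), which avoids cancellation for large |t|.
    if t >= 0:
        log_plus = math.log1p(t / s)
        log_minus = math.log(c) - math.log(s) - math.log(s + t)
    else:
        log_plus = math.log(c) - math.log(s) - math.log(s - t)
        log_minus = math.log1p(-t / s)
    # Normalising constant: 2^(a+b-1) * B(a, b) * sqrt(a + b).
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(c)
    log_norm = (c - 1) * math.log(2) + log_beta + 0.5 * math.log(c)
    return (a + 0.5) * log_plus + (b + 0.5) * log_minus - log_norm

def log_likelihood(data, a, b, loc=0.0, scale=1.0):
    """Log-likelihood of a location-scale Jones-Faddy skew-t for the data."""
    return sum(
        jf_skew_t_logpdf((x - loc) / scale, a, b) - math.log(scale)
        for x in data
    )

# Hypothetical profit rates, for illustration only.
sample = [-0.35, -0.02, 0.04, 0.07, 0.09, 0.11, 0.15, 0.22, 0.48, 1.10]
ll = log_likelihood(sample, a=2.0, b=1.0, loc=0.08, scale=0.15)
```

When a = b the density reduces to Student's t with 2a degrees of freedom, which gives a convenient check on the implementation.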
Other possibilities include special cases of generalised versions of the extreme value, logistic, and lognormal distributions. However, even if we overlook the absence of theoretical motivation, their possibilities for descriptive accuracy are limited by their fixed shape.39
A further alternative, attractive for its ability to model distributions with extreme skewness and kurtosis, is the stable family of distributions. This will be discussed further in Chapter Seven.
4.1.3.3 Contamination
There are two problems under this heading: first the possibility that some of our data relates to non-capitalist entities with essentially arbitrary rates of return, and second the presence of genuinely capitalist enterprises whose small size is associated with one or more distributions different to those of larger firms.
In either case the firms in question may record profit rates with small absolute magnitudes as well as extreme values, so we need a criterion by which to identify the relevant observations and either exclude or accommodate them even though they may be neither outliers nor extreme values.
There are various approaches one might take. First, one might exclude all firms with only one employee, in the hope that that will dispose of firms that are simply vehicles for independent contractors. Secondly, as we saw in our discussion of Serrano Cinco et al. (2001) in Chapter Three, Section 3, there are various ad hoc definitions of small and medium sized enterprises, some conventional in the literature and others devised for various official purposes, including the collection and classification of data.
One might either use one of these as further grounds for excluding observations, or as the basis for a formal mixture-of-distributions model and test for that. Thirdly, one might weight the data by firm size before testing, which would be tantamount to estimating what we have called the capital-level or capital space distribution.
39 Moreover, if they do happen to be valid they will automatically be detected by methods to be described
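The difference between the firm-level and the capital-level distribution can be sketched with a weighted quantile, in which each observation counts in proportion to a weight rather than once. This is only an illustrative sketch: the function name, the profit rates, and the capital figures are hypothetical, and the quantile definition (the smallest value whose cumulative weight share reaches q) is one simple convention among several.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative share of total weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]

# Hypothetical firms: profit rate and capital employed.
profit_rates = [-0.40, 0.05, 0.10, 0.12, 0.90]
capital = [1.0, 50.0, 100.0, 400.0, 2.0]

# Firm-level median: every firm counts once.
firm_median = weighted_quantile(profit_rates, [1.0] * len(profit_rates), 0.5)

# Capital-level median: every unit of capital counts once.
capital_median = weighted_quantile(profit_rates, capital, 0.5)
```

In this toy example the two extreme rates belong to the two smallest firms, so they carry almost no weight in the capital-level distribution.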
We would definitely like to exclude non-capitalist entities: theoretically, we do not believe that they participate in capitalist communism; practically, we believe that their reported rates of return are arbitrary constructs. On the other hand, we would like to accommodate small capitalist firms in spite of their greater range of profitability. In either case, we would like to avoid arbitrary criteria as far as possible. If we had some clear theoretical basis for some exclusion criteria this would amount to a discrete, all-or-nothing weighting scheme for dealing with small firms.
But we know that variability in profit rates is an inverse function of firm size (see Chapter Three, Figures 3.9 to 3.12). Observations of small firms’ profit rates are likely to be contaminants, in the sense that such entities may not be capitalist firms, in which case we want to reject them. If we take the view that the probability of an observation’s being such a contaminant is inversely related to firm size, then a system in which an observation’s probability of inclusion in our tests depended directly on the size of the firm to which it relates constitutes a system of fuzzy rejection of (possible) contaminants. In the words of an author cited by Barnett and Lewis (1994):
Since the object of combining observations is to obtain the best possible estimate of the true value of a magnitude, the principle … [is to assign] a smaller weight than the others in computing a weighted average. Of course retention with an exceedingly small weight amounts to virtual rejection. (Rider, 1933)
In place of ‘virtual rejection’ we prefer ‘fuzzy rejection’, since not all the weights are ‘exceedingly small’, and in fact some will be close to 1 (and one exactly 1). (See also Edgeworth, 1883; Glaisher, 1872–3; Stigler, S.M.; Stone, 1873.) Stone's procedure is effectively maximum likelihood estimation, and Edgeworth's an independent discovery of the same.
4.1.3.4 Proneness to outliers
There are two aspects to this. The first is the tendency of some distributional models to throw up outliers in the right-hand tail, which may thus be striking, but are not discordant. The second is that, because both our candidate distributions have support that is bounded below, observations below some threshold must be discordant. Since we do not have a clear theoretical basis on which to select such a threshold it becomes another parameter to be estimated.
As mentioned in our survey of previous contributions, the only known hypotheses about the functional form of profit rate distributions are the lognormal (Gibrat 1931) and the gamma (Farjoun and Machover 1983). The support of both distributions is bounded below on the real line; in other words, they have lower thresholds. Although Farjoun and Machover claim that a lower bound of zero is to be expected (on the grounds that firms which make losses will quickly cease to be firms), it is clear from our empirical density estimates that although the proportion of firms making moderate losses is small, it is nonetheless not negligible.
Furthermore, casual observation suggests that many firms can go on making losses for some time before the creditors get restive enough to call in the receiver. In fact, what the limiting rate of loss might be is itself estimated when we estimate the threshold parameter of a distribution that successfully models the rest of the data.
While the empirical density estimates of Figures 3.3 to 3.6 might suggest that this lower bound is only moderately negative, this appearance is clearly belied by the overall data. On the other hand, as Table 3.5 shows, very moderate trimming of the left-hand tails – only half a per cent of the total observations in many cases, and one or two per cent in nearly all the others – would give minima which are much more plausible as an estimate of a rate of loss which might be generally sustainable in the medium term.
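The effect of such trimming can be sketched as follows. The function name and the data are hypothetical and constructed only to mimic a sample with a handful of extreme losses; they are not the figures of Table 3.5.

```python
import math

def left_trimmed_minimum(values, fraction):
    """Minimum remaining after discarding the lowest `fraction` of observations."""
    ordered = sorted(values)
    discarded = math.floor(fraction * len(ordered))
    return ordered[discarded]

# Hypothetical sample of 1000 profit rates: three extreme losses
# plus a main body running upwards from -0.2.
rates = [-50.0, -12.0, -3.0] + [-0.2 + 0.001 * i for i in range(997)]

raw_minimum = left_trimmed_minimum(rates, 0.0)        # the untrimmed minimum
trimmed_minimum = left_trimmed_minimum(rates, 0.005)  # after trimming 0.5 per cent
```

Here half a per cent of 1000 observations is five observations, which is enough to remove all three extreme losses and leave a minimum inside the main body of the data.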
Thus any attempt to test the Gibrat and Farjoun and Machover hypotheses by fitting the relevant models to the data involves estimating the threshold parameter along with the shape and scale parameters.40 Unfortunately the usual maximum likelihood methods cannot be relied on to do this (see Appendix).
It would be possible by fiat to take the minimum observed value $x_0$ as the estimated threshold. But doing so on the basis of the data in Table 3.4, with their dramatically extended left tails, will result in estimated gamma or log-normal distributions closely approximating the normal, which on the face of Figures 3.3 to 3.6 is not a very appropriate model given the pervasive qualitative evidence of skewness; furthermore it will clearly fail to capture the kurtosis (since this is fixed at 3 for the normal).
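The drift towards normality can be made concrete with a small sketch. Here we fix the threshold and match the mean and standard deviation of the data, a moment-matching shortcut used only for illustration in place of maximum likelihood. For a gamma distribution with shape k the skewness is 2/sqrt(k) and the excess kurtosis 6/k; for a lognormal the skewness is (c^2 + 3)c, where c is the coefficient of variation measured from the threshold. The function name and data are hypothetical.

```python
import statistics

def implied_shape_measures(data, threshold):
    """Skewness and kurtosis implied by mean/variance matching, threshold fixed."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    # Coefficient of variation of the data measured from the threshold.
    cv = sd / (mean - threshold)
    return {
        "gamma_shape": 1.0 / cv ** 2,
        "gamma_skewness": 2.0 * cv,
        "gamma_excess_kurtosis": 6.0 * cv ** 2,
        "lognormal_skewness": (cv ** 2 + 3.0) * cv,
    }

# Hypothetical sample: one extreme loss and a main body between 0 and 1.
rates = [-50.0] + [0.001 * i for i in range(999)]

at_minimum = implied_shape_measures(rates, min(rates))
nearer = implied_shape_measures(rates, -5.0)
```

With the threshold at the observed minimum, the distance from threshold to mean dwarfs the standard deviation, so both fitted models have skewness close to zero; moving the threshold closer to the body of the data raises the implied skewness.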
However, a recent development in statistical technique – the method of L-moments – not only provides simple methods for threshold estimation but also offers more reliable methods for model selection than those based on the use of traditional moments.
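For orientation, the first four sample L-moments can be computed from the ordered data via unbiased probability-weighted moments. The sketch below assumes that standard construction; the function name is hypothetical, and it does not implement any particular threshold estimator.

```python
def sample_l_moments(data):
    """First four sample L-moments, plus L-skewness and L-kurtosis ratios."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least four observations are needed")
    # Unbiased probability-weighted moments b_0 .. b_3:
    # b_r = (1/n) * sum_j [(j-1)...(j-r)] / [(n-1)...(n-r)] * x_(j), j = 1..n.
    b = [0.0, 0.0, 0.0, 0.0]
    for j, xj in enumerate(x, start=1):
        weight = 1.0
        for r in range(4):
            b[r] += weight * xj
            if r < 3:
                weight *= (j - 1 - r) / (n - 1 - r)
    b = [value / n for value in b]
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return {"l1": l1, "l2": l2, "l3": l3, "l4": l4,
            "t3": l3 / l2, "t4": l4 / l2}

# Symmetric toy data: the L-skewness ratio t3 should vanish.
symmetric = sample_l_moments([1.0, 2.0, 3.0, 4.0, 5.0])
```

Because each L-moment is a linear combination of the order statistics, a single extreme observation affects it far less than it affects the third and fourth conventional moments, which raise deviations to higher powers.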
4.1.3.5 Summary
We would like to model the distribution of profit rates across both firms and capital, in order to test Gibrat and Farjoun and Machover respectively. Above we have discussed two basic strategies for dealing with the problem of extreme values: methods robust to outliers, and weighting the data as a method of fuzzy rejection of contaminants.
In the following section we consider a method which is not only robust in the presence of extreme values but will solve the problem of estimating location parameters which we identified in our discussion of discordancy.
We then turn to our scheme of fuzzy rejection, equivalent to estimating the capital-level distributions, and show that it diminishes the extreme value problem to the extent that our robust method now has some purchase on the problem.
40 There are unbounded distributions which exhibit skewness, some of which are special cases of models which will be considered below. An alternative would be Jones and Faddy's skewed-t, mentioned earlier, but we do not know of any theoretical motivation to consider this distribution.