Chapter 1: General Aspects of the Social Dimension of Trade
3. Labour Rights in Human Rights Covenants
The summary statistics presented in Chapter Three suggest not only that each of our measures contains very extreme values, but that these may be outliers in the sense of being distant from the main body of the data. Of perhaps greater importance is the fact that large subsets of each data set (and not merely the two extreme values in each) are outliers in the Barnett and Lewis sense of being surprising in relation to particular models.
In this section we consider which of the sources of outliers described above may be relevant to our data.
4.1.2.1 Deterministic causes
One possible source of data error, which if present could be very large, arises from the way FAME data is presented. Normally the data is rounded to the nearest £1,000 and the trailing zeros omitted; but for firms where no figure exceeds £10,000,000 the data is expressed in actual figures (the form adopted for each firm is identified by an indicator variable UNIT).
Clearly any inconsistency in doing this for different variables within a particular record could lead to errors of the order of 10^3 (either up or down, depending on the direction of the error and on whether it occurs in an item appearing in the numerator or the denominator of a particular profit measure). In practice, any such errors might be of lesser magnitude, in that some measures involve numerators and/or denominators calculated from two or more variables; in these cases the order of magnitude would be smaller, unless one supposes that the errors occur in exactly those variables forming the numerator or denominator, as the case may be.
35 However, apart from pointing out the link between deterministic causation of outliers and rejection, Barnett
Since any such errors may be as likely to be downward as upward, they could lead to discordantly small profit rate observations as well as discordantly large ones. The former, however, will be hard to detect, being mixed in with genuine observations of very small profit rates.
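The arithmetic behind this argument can be illustrated with a minimal sketch. The firm, its figures and the helper `profit_rate` are hypothetical and serve only to show the size and direction of a unit mix-up; they are not taken from FAME, and the simple ratio stands in for whichever profit measure is in use.

```python
def profit_rate(profit, capital):
    """Return a simple profit rate: profit divided by capital."""
    return profit / capital

SCALE = 1_000  # records are expressed either in pounds or in thousands of pounds

# Hypothetical firm, true figures in pounds (illustrative only).
true_profit = 2_000_000.0
fixed_capital = 15_000_000.0
working_capital = 5_000_000.0
true_capital = fixed_capital + working_capital
true_rate = profit_rate(true_profit, true_capital)

# Numerator mis-scaled: the rate is understated by a factor of 10^3.
low = profit_rate(true_profit / SCALE, true_capital)

# Whole denominator mis-scaled: the rate is overstated by a factor of 10^3.
high = profit_rate(true_profit, true_capital / SCALE)

# Only one of the two items in the denominator mis-scaled:
# the distortion is much smaller than 10^3.
partial = profit_rate(true_profit, fixed_capital / SCALE + working_capital)

for label, rate in [("low", low), ("high", high), ("partial", partial)]:
    print(label, rate / true_rate)
```

The downward case produces a rate close to zero, which is why such an error would be hard to tell apart from a genuinely small profit rate.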
However, while the above establishes the possibility of very large recording errors, it says nothing about the probability of their occurring. We have no reason, apart from the above arguments, to suspect their presence. Since the FAME database is prepared by a commercial entity and primarily used by commercial customers who will value accuracy highly (because they may, for example, base lending and other investment decisions on the results), it seems unlikely that many errors persist.
4.1.2.2 Inherent variability
The possibility of the outliers being due to some model of variability other than those of our two hypotheses is inherent in the notion of testing them.
4.1.2.3 Contamination
As a large literature descending from Gibrat attests (Sutton 1997), the distribution of firms by size is highly skewed by almost any measure of size. The same is true if we use the definitions implied by the various profit-rate measures examined in this paper.36
[Figure 4.2 panels: density histograms of log(k.1), log(k.45s7s), log(k.135) and log(k.24); vertical axis: Density]
Figure 4.2: firm size distributions, selected capital measures (see text)
Figure 4.2 presents histograms of the log of firm size, measured by four different capital measures (from left to right in each row, those involved in the profit rate measures Gillman 1; Gillman 4, 5s and 7s; Glick 1, 3 and 5; Glick 2 and 4). Even in these logarithmic plots some skewness is apparent, with each plot showing an upper tail longer than the lower.
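The visual impression of a longer upper tail can be checked numerically with a skewness coefficient computed on the logged sizes. The following is a minimal sketch, assuming the moment coefficient of skewness as the measure; the function names and the capital figures are hypothetical and are not drawn from the data set.

```python
import math

def sample_skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def log_size_skewness(sizes):
    """Skewness of log firm size; every size must be strictly positive."""
    return sample_skewness([math.log(s) for s in sizes])

# Hypothetical capital figures in pounds (illustrative only).
sizes = [5e3, 2e4, 5e4, 8e4, 1e5, 2e5, 5e5, 1e7, 1e9]

# A positive value indicates an upper tail longer than the lower one,
# even after taking logarithms.
print(log_size_skewness(sizes))
```

A value near zero would indicate that the logged sizes are roughly symmetric, as a pure lognormal size distribution would imply.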
Moreover, whatever measure of ‘size’ is used, the range is vast: for example, there are firms whose cost of sales is measured in mere thousands of pounds alongside ones with cost of sales in the hundreds or thousands of millions, in other words bigger by a factor of 10^6 or more.
Many such small firms are not genuine capitalist enterprises, but tax vehicles for individual contractors, or enterprises which employ only family members.37 If homes or cars can be put in the firm’s name, for instance, its net assets could easily be of the order of £10^5. As such, the profit and capital figures in the accounts may be driven by tax considerations rather than by a need to measure rates of return on capital accurately, and we should not be surprised at very extreme values (perhaps especially not at highly negative ones, if individuals have incentives to under-report both income and assets). We will want to exclude these observations as contaminants.
However, there is also the question of entities which, although small, are genuinely capitalistic. In Chapter Three, Section 3 we saw that the size of firms has a strong inverse relation to the range of profit rates which they experience. One possible explanation is that this is a case of a mixture of distributions, with slippage of the scale parameter, and possibly of others. But pursuing this line meets with two objections: first, we do not, in the nature of the case, have any information about what distributions might be mixed; second, if at all possible we would prefer a simpler model, in the sense of a single distributional law covering all cases. We will take up this theme again in Chapter Seven.
4.1.2.4 Proneness to outliers
One possible source of outliers is that the theoretical model is intrinsically prone to them. This is in fact the case with the gamma distribution (absolutely, but not relatively, as already noted).
However, even if appropriate tests showed that the right-tail observations were not discordant with this hypothesis, we still have to account for the left tail: since the gamma and lognormal distributions have support bounded below, some part of this must be discordant, unless we fix the threshold at or to the left of the leftmost observation, or unless we adopt some third model.
37 In 1995, of the 54,285 enterprises which reported the number of employees, 433 reported just one; 1,720
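The role of the threshold can be made concrete with a minimal sketch of a three-parameter (shifted) lognormal log-likelihood. The parameter values and the profit rates below are hypothetical, and the function names are ours. For the lognormal the density vanishes at the threshold itself, so in this sketch the threshold must lie strictly to the left of the leftmost observation for the likelihood to be finite.

```python
import math

def shifted_lognormal_logpdf(x, threshold, mu, sigma):
    """Log density when log(x - threshold) is normal with mean mu and s.d. sigma.

    Returns -inf for x <= threshold, where the density is zero.
    """
    if x <= threshold:
        return -math.inf
    shifted = x - threshold
    z = (math.log(shifted) - mu) / sigma
    return -math.log(shifted * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z

def log_likelihood(observations, threshold, mu, sigma):
    """Sum of log densities; -inf if any observation lies at or below the threshold."""
    return sum(shifted_lognormal_logpdf(x, threshold, mu, sigma) for x in observations)

# Hypothetical profit rates, including negative ones (illustrative only).
rates = [-0.30, -0.05, 0.02, 0.08, 0.15, 0.40]

# Threshold at zero: the negative observations are impossible under the model.
ll_zero = log_likelihood(rates, threshold=0.0, mu=-2.0, sigma=1.0)

# Threshold to the left of the leftmost observation: every observation is admissible.
ll_left = log_likelihood(rates, threshold=-0.5, mu=-0.5, sigma=1.0)

print(ll_zero, ll_left)
```

With the threshold at zero, any negative profit rate is discordant by construction, which is the sense in which part of the left tail cannot be reconciled with a model bounded below.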
As noted in the introduction to this chapter, we will be adopting methods which are capable of automatically suggesting alternative candidates.