So far we have considered the distribution of profit rate measures across the population of firms, and this would be appropriate in testing Gibrat’s log-normal hypothesis. But a test of Farjoun and Machover’s gamma distribution hypothesis requires us to estimate the rate of return achieved by each unit of capital (which we take to mean each £1 of capital invested).

In principle we could look at the population distribution directly: we could take each observed value for a given profit rate measure, look at the size $k_i$ of the firm concerned, as measured by the appropriate capital definition, and replicate the particular rate of return a further $k_i - 1$ times. But this would be computationally intractable, given the enormous number of observations that would result: one for each of more than 1,284bn £1 units of capital, in the case of Gillman 1, for example.

Thus some method of sampling will be preferable, which immediately raises the question of how big a sample to take. This is a difficulty. For example, the standard goodness-of-fit test is the Kolmogorov-Smirnov (KS).41 This essentially compares the empirical cumulative distribution function to that of the estimated model and assesses whether the maximum vertical distance between the two can be thought of as arising by chance. Informally, the more data one has, the greater the likelihood of noise giving rise to rejection of what might otherwise be a well-fitting model. However, one clearly cannot choose the size of sample with an eye on whether it makes it more (or less) likely that any particular model will pass the KS test. Happily, it turns out that this invidious choice can be avoided, thanks to a scheme suggested by Chris Jones (personal communication).

41 The Kolmogorov-Smirnov test is possibly the best-known, but alternatives are the Cramér-von Mises and
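The KS comparison described above can be sketched in a few lines. This is a minimal illustration rather than the procedure used in the study: the function names, the toy sample and the exponential stand-in model are all assumptions introduced here. The threshold uses the familiar large-sample 5% approximation $1.36/\sqrt{n}$, which strictly applies only when the model's parameters have not been estimated from the same data.

```python
import math

def ks_statistic(sample, model_cdf):
    """Maximum vertical distance between the empirical CDF and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for j, x in enumerate(xs):
        f = model_cdf(x)
        # The empirical CDF jumps from j/n to (j+1)/n at x, so check both sides.
        d = max(d, (j + 1) / n - f, f - j / n)
    return d

def exponential_cdf(x, rate=1.0):
    # Hypothetical stand-in model, chosen only because its CDF is closed-form.
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def approx_critical_value(n):
    # Large-sample 5% approximation; it shrinks as n grows, which is why
    # bigger samples reject a given model more readily.
    return 1.36 / math.sqrt(n)

sample = [0.1, 0.4, 0.7, 1.2, 2.5]  # toy data, not from the study
d = ks_statistic(sample, exponential_cdf)
reject = d > approx_critical_value(len(sample))
```

Because the threshold falls with $\sqrt{n}$ while a fixed misfit between model and data does not, the same small discrepancy that passes at $n = 100$ may fail at $n = 10{,}000$.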

Suppose that the profit rate $p_i$ of each firm $i$, according to a given profit rate measure, is entered once in a draw, in which the probability of being chosen is $k_i / k_{\max}$, where $k_i$ is the size of the $i$-th firm according to the capital measure used in the profit rate measure, and $k_{\max}$ is the size of the largest firm. Clearly the largest firm is guaranteed to have its profit rate $p_{\max}$ selected, while the probability of other observed profit-rates being drawn will be in accordance with the relevant firms’ size compared to that of the largest. However, the total number of profit-rates drawn will be random: we will have a randomly-sized random sample (RS2).
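A single draw of this kind might be sketched as follows. The function name `draw_rs2` and the four-firm toy data are assumptions made for illustration; only the selection rule (keep each firm's profit rate with probability equal to its capital divided by the largest firm's capital) comes from the text.

```python
import random

def draw_rs2(profit_rates, capital, rng):
    """Draw one randomly-sized random sample (RS2).

    Each firm's profit rate is entered once and kept with probability
    k_i / k_max, so the largest firm is always kept.
    """
    k_max = max(capital)
    # rng.random() lies in [0, 1), so a firm with k == k_max is always selected.
    return [p for p, k in zip(profit_rates, capital)
            if rng.random() < k / k_max]

# Hypothetical toy data: four firms with very unequal capital.
profit_rates = [0.05, 0.12, 0.08, 0.20]
capital = [10.0, 250.0, 40.0, 1000.0]

rng = random.Random(0)  # seeded only so the sketch is repeatable
sample = draw_rs2(profit_rates, capital, rng)
```

The length of `sample` varies from draw to draw, which is the sense in which the sample is randomly sized.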

Such an RS2 is an unbiased estimator of the overall population and hence the sample value of any statistic of interest should be an unbiased estimator of the population statistic.

To see this, consider conducting $m$ such draws and concatenating the results: $p_{\max}$ will naturally appear exactly $m$ times, while the remaining $p_i$ will tend to appear $m k_i / k_{\max}$ times. As $m \to k_{\max}$ each profit rate will, in probability, appear $k_i$ times, and $p_{\max}$ exactly $k_{\max}$ times, giving a close approximation to the distribution of profit-rates across the underlying population of capital units (for example, £1s), randomly divided among the $m$ samples. The individual sample statistics will of course be subject to variance but, given standard assumptions about their distribution converging to the Gaussian, averaging the sample values should give a result close to the population value. We will revisit this assumption in Chapter Five.
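The counting argument can be made explicit. As a sketch, write $X_i^{(j)}$ (notation introduced here, not in the original) for the indicator that the profit rate of firm $i$ is selected in draw $j$, and assume the draws are independent. The number of appearances of firm $i$'s profit rate in the concatenation of $m$ draws is then binomial:

$$
N_i(m) = \sum_{j=1}^{m} X_i^{(j)}, \qquad
\mathbb{E}\left[N_i(m)\right] = \frac{m\,k_i}{k_{\max}}, \qquad
\operatorname{Var}\left[N_i(m)\right] = m\,\frac{k_i}{k_{\max}}\left(1 - \frac{k_i}{k_{\max}}\right).
$$

Setting $m = k_{\max}$ gives $\mathbb{E}[N_i] = k_i$, and for the largest firm the variance is zero, so its profit rate appears exactly $k_{\max}$ times. For any large $m$ the share of firm $i$'s profit rate in the concatenated sample tends to $k_i / \sum_j k_j$, which is that firm's share of the population of capital units.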

We could, of course, actually take $k_{\max}$ RS2s, estimate any desired statistic as just suggested, and cross-check against the value for the simulated population. However, given the size of $k_{\max}$, this would require unfeasible amounts of computation. The question is thus how few RS2s we can take while securing adequate protection against variance. The procedure outlined is akin to the bootstrap procedure (Efron and Tibshirani, 1993), and a rule of thumb for the bootstrap is that for datasets of ‘reasonable’ size, 100 replications should be enough.

It turns out that because of the extreme range and skewness of $k_i$ under all the profit rate measures (the vast majority of firms are very small), the RS2s drawn as described are rather sparse (in two cases, Glick 2 and 4, sufficiently sparse that it is not guaranteed that a given RS2 will be large enough to calculate $\tau_4$; in other words, some of these samples are smaller than four). The solution, however, is simple: treat the concatenation of the $m$ samples as a single larger RS2 sample and calculate the desired statistic, replicate the modified RS2 procedure $n$ times, and take an average of the statistics. In our procedure $n = m = 100$, and thus each profit rate estimate involves 10,000 separate random samples.
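The modified procedure (pool $m$ draws, compute the statistic on the pool, repeat $n$ times, average) might look like the sketch below. The function names and toy data are assumptions made for illustration, and the plain mean stands in for whichever statistic is of interest; it is used here only because the result can be compared directly with the capital-weighted mean.

```python
import random
import statistics

def draw_rs2(profit_rates, capital, rng):
    """One randomly-sized random sample: keep p_i with probability k_i / k_max."""
    k_max = max(capital)
    return [p for p, k in zip(profit_rates, capital)
            if rng.random() < k / k_max]

def pooled_rs2_estimate(profit_rates, capital, statistic, m=100, n=100, seed=0):
    """Average a statistic over n pooled samples, each the concatenation of m RS2s."""
    rng = random.Random(seed)
    replicate_values = []
    for _ in range(n):
        pooled = []
        for _ in range(m):
            pooled.extend(draw_rs2(profit_rates, capital, rng))
        # The pool always holds at least m values, because the largest
        # firm is selected in every draw.
        replicate_values.append(statistic(pooled))
    return statistics.mean(replicate_values)

# Hypothetical toy data: four firms with very unequal capital.
profit_rates = [0.05, 0.12, 0.08, 0.20]
capital = [10.0, 250.0, 40.0, 1000.0]

estimate = pooled_rs2_estimate(profit_rates, capital, statistics.mean)
capital_weighted_mean = (
    sum(p * k for p, k in zip(profit_rates, capital)) / sum(capital)
)
```

With $n = m = 100$ the inner draw runs 10,000 times, matching the count given above. On this toy data `estimate` should land close to `capital_weighted_mean`, which is the mean rate of return per unit of capital.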

We note that the above also implements the fuzzy rejection strategy we suggested for coping with contaminants in a firm-level test.
