Testing the Existence of Clustering in the Extreme Values


Jose Olmo∗

Dept. of Economics U. Carlos III de Madrid

This version, October 2004

Abstract

The extremal index is the key parameter for extending extreme value theory for iid random variables to stationary processes; it reflects the level of dependence in the largest observations defined by a threshold sequence $\{u_n\}$. This paper introduces an estimator for this parameter in two steps. First, it proposes a partition of the sample into blocks that permits one to construct the point process of the maxima over these blocks that exceed $\{u_n\}$. Second, a sequence $\{v_n\}$ with $v_n > u_n$ defines a thinning of this point process with the property that the mean cluster size of the stationary sequence defined by $\{v_n\}$ and the corresponding partition is 1. The estimator is defined as the ratio of the numbers of elements of the two point processes. It is asymptotically unbiased, in contrast to standard methods in the literature, which require some conditions on $\{u_n\}$; it is also consistent, and the central limit theorem can be applied. It therefore supports a hypothesis test for the extremal index, and hence a test for the existence of clustering in the extreme values. Other advantages of this method are that it allows some freedom in the choice of $\{u_n\}$ and that it is not very sensitive to the choice of the partition. Finally, we carry out a simulation experiment for some known examples and study the clustering of the extreme values of the DAX Index returns and their squares (clustering in volatility).

Keywords: Extremal Index, Extreme Value Theory, GARCH processes, Poisson Processes.

∗Address for correspondence: Universidad Carlos III de Madrid, C/ Madrid 126, 28903, Getafe (Madrid). E-mail: [email protected]. Financial support from DGCYT Grant (SEC01-0890) is gratefully acknowledged. The author is deeply grateful to the Department of Statistics and Operations Research of UNC at Chapel Hill, especially to Ross Leadbetter and Francisco Chamu. The author also thanks the seminar participants of the 3rd Symposium on Extreme Value Analysis: Theory and Practice, held in Aveiro (Portugal), as well as its scientific committee for the C.E.A.U.L. prize for young researchers. The software developed for this paper is implemented in Matlab 6.1 and may be downloaded from the author home page wwww.unc.edu/home/olmo.


1 Background

Suppose we have a random sample from an unknown distribution function $F$, and let $G$ be the limiting distribution of the sample maximum $M_n$. Classical extreme value theory shows that, under some regularity conditions on the tail of $F$ and for some suitable constants $a_n > 0$, $b_n$,

\[ P\{a_n^{-1}(M_n - b_n) \le x\} \to G(x), \tag{1} \]

where $G$ must be one of the following types (see de Haan (1976)),

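Type I: (Gumbel)
\[ G(x) = \exp\left(-e^{-x}\right), \qquad -\infty < x < \infty. \]

Type II: (Fréchet)
\[ G(x) = \begin{cases} 0, & x \le 0, \\ \exp\left(-x^{-1/\xi}\right), & x > 0, \end{cases} \qquad \xi > 0. \]

Type III: (Weibull)
\[ G(x) = \begin{cases} 1, & x \ge 0, \\ \exp\left(-(-x)^{-1/\xi}\right), & x < 0, \end{cases} \qquad \xi < 0. \]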

This important result may be extended to study the maximum of a wide class of dependent processes. We concentrate here on stationary sequences where the dependence is restricted by different distributional mixing conditions. We distinguish two types of dependence: long range and short range dependence. To limit the first type of dependence we assume the distributional mixing condition $D(u_n)$ of Leadbetter, Lindgren and Rootzén (1983). This condition is said to hold for a sequence $\{u_n\}$ if for any integers $1 \le i_1 < \dots < i_p < j_1 < \dots < j_{p'} \le n$ for which $j_1 - i_p \ge l$, we have

\[ D(u_n): \quad \left| F_{i_1,\dots,i_p,j_1,\dots,j_{p'}}(u_n) - F_{i_1,\dots,i_p}(u_n)\, F_{j_1,\dots,j_{p'}}(u_n) \right| \le \alpha_{n,l}, \tag{2} \]

where $\alpha_{n,l_n} \to 0$ as $n \to \infty$ for some $l_n = o(n)$, and $F_{i_1,\dots,i_p}(u_n)$ denotes $P\{X_{i_1} \le u_n, \dots, X_{i_p} \le u_n\}$. Note that this condition only concerns events of the form $\{X_i \le u_n\}$, in contrast to more restrictive mixing conditions such as the strong mixing introduced by Rosenblatt (1956).

The condition $D(u_n)$ alone is sufficient to extend the central result about the limiting distribution of the maximum to stationary sequences, for some suitable constants that are not necessarily the ones obtained in the iid context. In particular, these constants $a_n > 0$, $b_n$ and the extreme value distribution $G$ are the same as in the iid case under a condition $D'(u_n)$ that restricts short range dependence (it rules out the presence of clusters),

\[ D'(u_n): \quad \limsup_{n \to \infty} \; n \sum_{j=2}^{[n/k_n]} P\{X_1 > u_n, X_j > u_n\} \to 0 \quad \text{as } k_n \to \infty, \tag{3} \]

with $k_n$ a sequence that defines a partition of the sample.

Otherwise, for a stationary sequence $\{X_i\}$ satisfying only $D(u_n)$ with $u_n = a_n x + b_n$, we typically have

\[ P\{a_n^{-1}(M_n - b_n) \le x\} \to G^{\theta}(x), \tag{4} \]

where $\theta$, the extremal index, is the key parameter for extending extreme value theory for iid random variables to stationary sequences. This concept, which originated in papers by Loynes (1965) and O’Brien (1974) and was developed in detail by Leadbetter (1983), reflects the effect of the clustering of the observations exceeding $u_n$ on the limiting distribution of the maximum.

There are different interpretations of the extremal index $\theta$, concerning diverse features of the clustering of the largest observations. Under different mixing conditions Loynes (1965) found that

\[ P\{M_n \le u_n\} = F^{n\theta}(u_n), \tag{5} \]

with $F$ the marginal distribution and $M_n$ the sample maximum of the stationary sequence.
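A simple check of (5) is given by the moving-maximum sequence; this example is an illustration and is not taken from the paper. Let $Z_1, Z_2, \dots$ be iid with distribution function $H$ and define $X_i = \max(Z_i, Z_{i+1})$, where the symbols $Z_i$ and $H$ are introduced only for this example. Then

```latex
\begin{aligned}
F(x) &= P\{Z_i \le x,\ Z_{i+1} \le x\} = H^2(x), \\
M_n &= \max(X_1,\dots,X_n) = \max(Z_1,\dots,Z_{n+1}), \\
P\{M_n \le u_n\} &= H^{n+1}(u_n) = F^{(n+1)/2}(u_n) = F^{n/2}(u_n)\,F^{1/2}(u_n).
\end{aligned}
```

Since $F(u_n) \to 1$ whenever $n(1 - F(u_n))$ stays bounded, the last factor tends to one and the expression agrees with (5) for $\theta = 1/2$. This reflects the fact that large values of this sequence tend to arrive in pairs.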

O’Brien (1987) showed that

\[ P\{M_{2,r_n} \le u_n \mid X_1 > u_n\} \to \theta, \tag{6} \]

where $M_{2,r_n}$ is the maximum of $\{X_2, \dots, X_{r_n}\}$, with $r_n = o(n)$ satisfying certain other growth conditions. Note that from this definition of the extremal index it is straightforward to see that $0 \le \theta \le 1$. Alternatively, Leadbetter (1983) showed that the inverse of the extremal index is the limiting mean number of exceedances of $u_n$ in an interval of length $r_n$ that contains at least one exceedance, i.e.

\[ E\left[ \sum_{j=1}^{r_n} I(X_j > u_n) \,\Bigg|\, \sum_{j=1}^{r_n} I(X_j > u_n) \ge 1 \right] \to \theta^{-1}, \tag{7} \]

with $I(\cdot)$ the indicator function. By stationarity this is called the limiting mean cluster size of the process. Finally, Hsing (1993) and Ferro and Segers (2003) take advantage of

\[ P\{M_n \le u_n\} \to e^{-\theta\tau}, \tag{8} \]

with $0 < \tau < \infty$, in two different ways. Hsing approximates the distribution of $n(1 - F(M_n))$ by an exponential distribution with mean $\theta^{-1}$, and Ferro and Segers model the process of the interexceedance times defined by $u_n$ by the same limiting exponential distribution.

Expression (8) may be regarded as an extension of the distribution of the maximum in the iid case; that is, (1) may be written as $P\{M_n \le u_n\} \to e^{-\tau}$, with $\tau$ the exponent of an extreme value distribution and $u_n = a_n x + b_n$. From this expression, by taking logs, it is immediate to derive that $n(1 - F(u_n)) \to \tau$ with $0 < \tau < \infty$. Then, for iid sequences, $B_n^{(u_n)} = \sum_{j=1}^{n} I(X_j > u_n)$ converges in distribution to a Poisson random variable with mean $\tau$. However, for dependent stationary sequences where $D'(u_n)$ is not satisfied, $B_n^{(u_n)}$ does not converge to a Poisson random variable (the exceedances of $u_n$ are not mutually independent); nevertheless, we can define a point process as a result of thinning $B_n^{(u_n)}$ that converges to a Poisson process with intensity $\theta\tau$; see Leadbetter (1983) or Leadbetter, Lindgren and Rootzén (1983).
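The Poisson limit for the number of exceedances in the iid case can be checked numerically. The sketch below is an illustration and not part of the paper's software. It assumes that the threshold satisfies $1 - F(u_n) = \tau/n$ exactly, so that $B_n^{(u_n)}$ is Binomial$(n, \tau/n)$; the function names and the values $n = 1000$, $\tau = 2$ are arbitrary choices.

```python
import math


def binomial_pmf(j, n, p):
    """P{B = j} for B ~ Binomial(n, p): the exceedance count of an iid sample."""
    return math.comb(n, j) * p**j * (1 - p) ** (n - j)


def poisson_pmf(j, tau):
    """P{N = j} for N ~ Poisson(tau): the limiting exceedance count."""
    return math.exp(-tau) * tau**j / math.factorial(j)


# Assumed setting: n iid observations, exceedance probability tau / n.
n, tau = 1000, 2.0

# Largest pointwise gap between the two probability functions for j = 0..10.
gap = max(
    abs(binomial_pmf(j, n, tau / n) - poisson_pmf(j, tau)) for j in range(11)
)
print(f"max |binomial - poisson| over j = 0..10: {gap:.6f}")
```

The gap shrinks as $n$ grows with $\tau$ fixed, which is the iid benchmark against which the clustered case is compared.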

This paper presents an alternative derivation of the extremal index as the result of thinning $B_n^{(u_n)}$ twice. The first thinning defines the process $N_{k_n}^{(u_n)}$ formed by the maxima over $k_n$ blocks of length $r_n$ that exceed $u_n$. This process converges to a Poisson process $N$ with mean $\theta\tau$. The second thinning of $B_n^{(u_n)}$, and hence a thinning of $N_{k_n}^{(u_n)}$, defines another point process $N_{k_n}^{(v_n)}$ that converges in distribution to a Poisson process with intensity $\theta^2\tau$. The sequence $\{v_n\}$ satisfies $n(1 - F(v_n)) \to \theta\tau$ and is defined by

\[ E\left[ \sum_{j=1}^{r_n} I(X_j > v_n) \,\Bigg|\, \sum_{j=1}^{r_n} I(X_j > u_n) \ge 1 \right] \to 1. \]

Under some mild conditions on the threshold sequence, this method provides a consistent estimator of the extremal index that outperforms most of the popular estimators. In contrast to the other estimators of $\theta$, it is not very sensitive to the choice of the block size $r_n$ or of the sequence $\{u_n\}$.

The paper is structured as follows. Section 2 introduces a definition of the extremal index based on point processes, in particular the point processes derived from the asymptotic distribution of the maximum. A natural estimator for this parameter based on these techniques is introduced in Section 3. This section also reviews some of the most popular estimators in the literature and their statistical properties, in particular bias and variance. The corresponding properties of our estimator are also studied, with special emphasis on the analysis of the mean square error compared to the rest of the methods. The optimal block size selection is considered with regard to its impact on these estimators. Finally, this section presents a hypothesis test for the extremal index that is sufficient to test the existence of clustering in the extremes. A simulation experiment for different examples presented in the literature is conducted in Section 4, with emphasis on a Monte Carlo experiment for the mean square error. Section 5 presents an application to DAX Index returns in order to gain some understanding of the clustering in the extremes and in the volatility of the process. Finally, the conclusions are found in Section 6.


2 Definition of the extremal index

Suppose throughout that we have $n$ observations from a stationary sequence $\{X_i, i \ge 1\}$ with marginal distribution function (d.f.) $F$ satisfying $[1 - F(x)]/[1 - F(x^-)] \to 1$ as $x \to \infty$. This condition is sufficient to define a sequence $\{u_n\}$ for each $0 < \lambda < \infty$ such that

\[ n(1 - F(u_n)) \to \lambda. \tag{9} \]

Consider from now on that $\{X_n\}$ satisfies $D(u_n)$, as defined in (2), for each $\lambda > 0$.

Intuitively, this condition gives a measure of the degree of dependence in the process and permits the construction of almost independent blocks through sequences $\{k_n\}$, $\{r_n\}$ with $k_n \to \infty$, $k_n = o(n)$ and $k_n l_n = o(n)$, while $r_n$ is the integer part of $n/k_n$. These sequences are interpreted as follows: $k_n$ is the number of blocks into which the sequence of length $n$ is divided, and $r_n$ is the size of each block.
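The partition into $k_n$ blocks of length $r_n = [n/k_n]$ can be sketched as follows. This is a minimal illustration, not the paper's Matlab code; the function name `block_maxima` and the toy data are assumptions, and observations beyond $k_n r_n$ are simply dropped.

```python
def block_maxima(x, k_n):
    """Split x into k_n consecutive blocks of length r_n = floor(n / k_n).

    Returns (r_n, maxima), where maxima[j] is the maximum of block j.
    Trailing observations that do not fill a complete block are ignored.
    """
    n = len(x)
    r_n = n // k_n
    if r_n == 0:
        raise ValueError("k_n must not exceed the sample size")
    maxima = [max(x[j * r_n:(j + 1) * r_n]) for j in range(k_n)]
    return r_n, maxima


# Toy sample of n = 10 observations split into k_n = 3 blocks of size 3.
x = [0.3, 2.1, 1.7, 0.2, 0.9, 3.4, 3.1, 0.1, 0.5, 0.4]
r_n, maxima = block_maxima(x, 3)
print(r_n, maxima)
```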

Under these assumptions, if $P\{M_n \le u_n\}$ converges for some $\lambda > 0$ then

\[ P\{M_n \le u_n\} \to e^{-\theta\lambda}, \tag{10} \]

for all $\lambda > 0$, with $0 \le \theta \le 1$ (see Theorem 3.7.1 of Leadbetter, Lindgren and Rootzén (1983) for a detailed proof). The parameter $\theta$ is called the extremal index of the sequence $\{X_i\}$ and is the key parameter for extending extreme value theory from iid random variables to stationary processes.

Consider sequences $\{k_n\}$, $\{r_n\}$ that define a suitable partition. Then a sufficient condition for the existence of the extremal index is

\[ k_n\left(1 - F_{1,\dots,r_n}(u_n)\right) \to \theta\lambda. \tag{11} \]

This result is immediate from the approximation of $P\{M_n \le u_n\}$ by $P^{k_n}\{M_{r_n} \le u_n\}$ for suitable choices of $k_n$ and $r_n$, together with (10) and the first-order expansion of the exponential function.

The converse of this result is also true, i.e. a stationary sequence with extremal index $\theta$ satisfies (11) for each $\lambda > 0$. The proof is obtained by taking logs in the expression $P^{k_n}\{M_{r_n} \le u_n\}$, which approximates $e^{-\theta\lambda}$.

Consider the number of exceedances of $u_n$ within a block of size $r_n$. This defines a sequence of random variables $B_{r_n}^{(u_n)} = \sum_{j=1}^{r_n} I(X_j > u_n)$ for $r_n \to \infty$ and $r_n = o(n)$ whose conditional expected value, by the stationarity of the process, converges to the mean cluster size of the exceedances of $u_n$ in the sequence $\{X_i\}$, that is, the inverse of the extremal index,

\[ E\left[ B_{r_n}^{(u_n)} \,\middle|\, B_{r_n}^{(u_n)} \ge 1 \right] \to \theta^{-1}. \]


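This is readily seen since

\[ E\left[ B_{r_n}^{(u_n)} \,\middle|\, B_{r_n}^{(u_n)} \ge 1 \right] = \sum_{j=1}^{\infty} j\, P\left\{ B_{r_n}^{(u_n)} = j \,\middle|\, B_{r_n}^{(u_n)} \ge 1 \right\} = \frac{r_n P\{X_1 > u_n\}}{P\left\{ \bigcup_{j=1}^{r_n} (X_j > u_n) \right\}}, \]

and therefore

\[ E\left[ B_{r_n}^{(u_n)} \,\middle|\, B_{r_n}^{(u_n)} \ge 1 \right] = \frac{r_n (1 - F(u_n))}{1 - F_{1,\dots,r_n}(u_n)} \to \theta^{-1}, \]

if (11) holds.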

The same argument may be applied to define a process $B_{r_n}^{(v_n)}$ with $v_n \ge u_n$ satisfying

\[ E\left[ B_{r_n}^{(v_n)} \,\middle|\, B_{r_n}^{(u_n)} \ge 1 \right] \to 1. \tag{12} \]

It is of interest to note that the sequence $\{v_n\}$ satisfies condition $D(v_n)$ since $v_n \ge u_n$ and

\[ n(1 - F(v_n)) \to \theta\lambda \quad \text{as } n \to \infty. \tag{13} \]

In addition, by the structure of dependence (see (9) and (10)) we have $P\{M_n \le v_n\} \to e^{-\theta^2\lambda}$. It is now immediate to see that (11) holds for the sequence $\{v_n\}$ in the form

\[ k_n\left(1 - F_{1,\dots,r_n}(v_n)\right) \to \theta^2\lambda. \tag{14} \]

The event $\{X_i > u_n\}$ and the sequences $\{k_n\}$, $\{r_n\}$ divide the sequence $\{X_n\}$, with extremal index $\theta$, into approximately independent groups of exceedances of $u_n$, where $M_{(j-1)r_n+1,\,jr_n}$ is the block maximum for $j = 1, \dots, k_n$. It is clear that the sequence $\{M_{(j-1)r_n+1,\,jr_n}\}$ is approximately serially independent as $n$ increases if $D(u_n)$ holds for $\{X_n\}$.

Consider the points $j$ as points in time and define, for each $n$ and $k_n$, a process $\eta_{k_n}(j/k_n) = M_{(j-1)r_n+1,\,jr_n}$. The time scale is normalized, $t = j/k_n$, to the unit interval $(0, 1]$. Then the exceedances of $u_n$ by the process $\eta_{k_n}(t)$ define a point process $N_{k_n}^{(u_n)}$ on the unit interval (see Kallenberg (1976) for the theory of point processes). Moreover, the point process $N_{k_n}^{(u_n)}$ converges in distribution to a Poisson process $N$ on $(0, 1]$ with intensity parameter $\theta\lambda$. To prove this result it is only necessary to show that $E[N_{k_n}^{(u_n)}(a, b]] \to E[N(a, b]]$ for $0 < a < b \le 1$ and $P\{N_{k_n}^{(u_n)}(A) = 0\} \to P\{N(A) = 0\}$ for each finite disjoint union $A$ of sets $(a_i, b_i] \subset (0, 1]$.
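The construction of the point process of exceeding block maxima on $(0, 1]$ can be sketched in a few lines. The code is an illustration only; the function names `exceedance_times` and `count_in` and the toy data are assumptions and do not come from the paper's software.

```python
def exceedance_times(x, u, k_n):
    """Normalized times t = j / k_n of the blocks whose maximum exceeds u.

    Block j (j = 1, ..., k_n) covers observations (j - 1) * r_n + 1, ..., j * r_n
    with r_n = floor(n / k_n); trailing observations are ignored.
    """
    r_n = len(x) // k_n
    return [
        (j + 1) / k_n
        for j in range(k_n)
        if max(x[j * r_n:(j + 1) * r_n]) > u
    ]


def count_in(times, a, b):
    """Number of points of the process in the interval (a, b]."""
    return sum(1 for t in times if a < t <= b)


# Toy sample: n = 16, k_n = 4 blocks of size 4, threshold u = 4.
x = [0, 5, 6, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 8, 9, 9.5]
times = exceedance_times(x, 4, 4)
print(times, count_in(times, 0.5, 1.0))
```

Each cluster of exceedances inside a block contributes a single point, which is the thinning that turns the exceedance counts into an approximately Poisson process.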

The expected value of the point process over an interval $(a, b] \subset (0, 1]$ converges to $(b - a)\theta\lambda$, i.e.

\[ E\left[ N_{k_n}^{(u_n)}(a, b] \right] = (b - a)\, k_n P\{M_{1,r_n} > u_n\} \to (b - a)\theta\lambda, \tag{15} \]

by (11), since $\{X_n\}$ has extremal index $\theta$. Consider for the second part of the proof the set $A = \bigcup_{i=1}^{p} (a_i, b_i]$ covering the interval $(0, 1]$, and write $A_i$ for the integers in $(k_n a_i, k_n b_i]$. Then the proof of the second condition can be summarized in two main facts. First,

\[ P\left\{ \bigcap_{i=1}^{p} \left( M(A_i) \le u_n \right) \right\} - \prod_{i=1}^{p} P\{M(A_i) \le u_n\} \to 0, \]

if $D(u_n)$ holds, with $M(A_i)$ the maximum of the elements of the sequence $\{M_{(j-1)r_n+1,\,jr_n}\}$ with $j$ in the set of integers $A_i$. And second,

\[ P\left\{ N_{k_n}^{(u_n)}(A) = 0 \right\} \to \prod_{i=1}^{p} P\{N(a_i, b_i] = 0\} = P\{N(A) = 0\}, \]

by the properties of the Poisson process for non-overlapping intervals $(a_i, b_i]$, i.e. $P\{N(a_i, b_i] = 0\} = e^{-\theta\lambda(b_i - a_i)}$. A more detailed proof of a similar result is found in Leadbetter (1983).

It is interesting to see that the same argument may be applied to construct a thinning of $N_{k_n}^{(u_n)}$ by a sequence $\{v_n\}$ satisfying (12). This sequence defines the point process $N_{k_n}^{(v_n)}$ on the unit interval, which converges to a Poisson process with intensity measure $\theta^2\lambda$. The proof is analogous to the case of $N_{k_n}^{(u_n)}$, since (14) and $D(v_n)$ hold with $v_n \ge u_n$.

These results provide a setting, defined by the point processes $N_{k_n}^{(u_n)}$ and $N_{k_n}^{(v_n)}$ on $(0, 1]$, that is very useful for giving an alternative definition of the extremal index based on the theory of point processes,

\[ \theta = \lim_{n \to \infty} \frac{E\left[ N_{k_n}^{(v_n)} \right]}{E\left[ N_{k_n}^{(u_n)} \right]}. \tag{16} \]

There is another interpretation of the extremal index, obtained from the results for the marginal distribution $F$ given in (9) and (13),

\[ \theta = 1 - \lim_{n \to \infty} F_{u_n}(v_n), \tag{17} \]

with $F_{u_n}(v_n) = \frac{F(v_n) - F(u_n)}{1 - F(u_n)}$ the conditional excess distribution function defined by $u_n$. It is clear that as the dependence in the stationary sequence decreases, $v_n$ approaches $u_n$ and $\theta$ gets closer to one, as in the iid case or under weak dependence ($D(u_n)$ and $D'(u_n)$ hold).
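The step from (9) and (13) to (17) can be written out explicitly. The short derivation below reads the conditional excess distribution as $F_{u_n}(v_n) = [F(v_n) - F(u_n)]/[1 - F(u_n)]$ and uses nothing beyond those two limits.

```latex
\begin{aligned}
1 - F_{u_n}(v_n)
  &= 1 - \frac{F(v_n) - F(u_n)}{1 - F(u_n)}
   = \frac{1 - F(v_n)}{1 - F(u_n)} \\
  &= \frac{n\,(1 - F(v_n))}{n\,(1 - F(u_n))}
   \;\longrightarrow\; \frac{\theta\lambda}{\lambda} = \theta .
\end{aligned}
```

In words, $\theta$ is the limiting probability that an observation exceeding $u_n$ also exceeds $v_n$.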

These definitions of the extremal index are also valid for threshold sequences where (9) does not hold but the mixing condition in (2) still holds. Consider $\tilde u_n$ such that $n(1 - F(\tilde u_n)) = \lambda_n$, with $\lambda_n \to \infty$ and $\lambda_n = o(n)$. This condition implies that $P\{M_n \le \tilde u_n\} \to 0$.

A necessary condition on $\tilde u_n$ in order to define the extremal index is that the ratio $P\{M_n \le \tilde u_n\} / [n(1 - F(\tilde u_n))]$ converges to a constant in $(0, 1)$. If the sequence $\{X_n\}$ has extremal index $\theta$, conditions (9) and (11) are satisfied for a certain sequence $u_n$. Then a sufficient condition on $\tilde u_n$ is that

\[ \frac{(1 - F(\tilde u_n))\,(1 - F_{1,\dots,r_n}(u_n))}{(1 - F(u_n))\,(1 - F_{1,\dots,r_n}(\tilde u_n))} \to 1. \tag{18} \]

This condition entails that $k_n(1 - F_{1,\dots,r_n}(\tilde u_n)) = \lambda'_n$ with $\lambda'_n \to \infty$ and $\lambda'_n/\lambda_n \to \theta$. The results obtained for $u_n$ and constant $\lambda$ therefore carry over to $\tilde u_n$ and $\lambda_n$. In particular, the sequence $B_{r_n}^{(\tilde u_n)}$ satisfies

\[ E\left[ B_{r_n}^{(\tilde u_n)} \,\middle|\, B_{r_n}^{(\tilde u_n)} \ge 1 \right] \to \theta^{-1}, \]

and there exists a sequence $\tilde v_n$ such that $n(1 - F(\tilde v_n)) = \lambda'_n$. Moreover, if this sequence satisfies (18) (replacing $\tilde u_n$ by $\tilde v_n$), then $k_n(1 - F_{1,\dots,r_n}(\tilde v_n)) = \lambda''_n$, with $\lambda''_n \to \infty$ and $\lambda''_n/\lambda'_n \to \theta$.


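The extremal index can be defined by means of sequences $\tilde u_n$ and $\tilde v_n$, with $\tilde v_n > \tilde u_n$ and such that $D(\tilde u_n)$ holds. Then

\[ \theta = \lim_{n \to \infty} \frac{E\left[ Z^*_{\tilde v_n} \right]}{E\left[ Z^*_{\tilde u_n} \right]}, \tag{19} \]

with $Z^*_{\tilde u_n}$ and $Z^*_{\tilde v_n}$ the numbers of blocks of the partition defined by $\{k_n\}$, $\{r_n\}$ in which there is at least one exceedance of $\tilde u_n$ and $\tilde v_n$ respectively.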

3 Estimation of the extremal index

The extremal index represents the clustering of the largest observations determined by a sufficiently high sequence $\{u_n\}$ satisfying a condition of type (9). This serial dependence has an effect on the distribution of the maximum of the stationary sequence, that is, $P\{M_n \le u_n\}$ is $F^{n\theta}(u_n)$ instead of $F^{n}(u_n)$ for $n$ and $u_n$ sufficiently large.

This result leads to the first estimator of the extremal index for appropriate sequences $k_n$, $r_n$ such that $P^{k_n}\{M_{r_n} \le u_n\}$ approximates $P\{M_n \le u_n\}$. Then, by taking logs in both expressions, $\theta = \frac{\log P\{M_{r_n} \le u_n\}}{r_n \log F(u_n)}$. A natural estimator for the extremal index is in this case

\[ \hat\theta_n^{(1)} = \frac{\log\left(1 - Z^*_{u_n}/k_n\right)}{r_n \log\left(1 - Z_{u_n}/n\right)}, \tag{20} \]

with $Z_{u_n}$ the number of exceedances of $u_n$ by $\{X_n\}$. The ratio $Z_{u_n}/n$ is an estimator of $1 - F(u_n)$, and $Z^*_{u_n}/k_n$ an estimator of $1 - F_{1,\dots,r_n}(u_n)$.
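A minimal sketch of the logs estimator (20) follows. It is an illustration under stated assumptions, not the paper's Matlab implementation: the function name `logs_estimator` is hypothetical, the threshold `u` is supplied by the user, and only the first $k_n r_n$ observations are used so that the sample length equals $k_n r_n$.

```python
import math


def logs_estimator(x, u, k_n):
    """Logs estimator: log(1 - Z*/k_n) / (r_n * log(1 - Z/n))."""
    r_n = len(x) // k_n
    sample = x[:k_n * r_n]          # drop observations outside the blocks
    n = len(sample)
    # Z: number of exceedances of u in the sample.
    z = sum(1 for value in sample if value > u)
    # Z*: number of blocks with at least one exceedance of u.
    z_star = sum(
        1 for j in range(k_n) if max(sample[j * r_n:(j + 1) * r_n]) > u
    )
    if z == 0 or z_star == k_n:
        raise ValueError("threshold too high or too low for this partition")
    return math.log(1 - z_star / k_n) / (r_n * math.log(1 - z / n))


# Toy sample: n = 16, k_n = 4, threshold u = 4, so Z = 3 and Z* = 2.
x = [0, 5, 6, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0]
theta_logs = logs_estimator(x, 4, 4)
print(theta_logs)
```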

On the other hand, the concept of extremal index introduced by Leadbetter (1983), with $\theta^{-1}$ the limiting mean cluster size of the exceedances, yields the estimator

\[ \hat\theta_n^{(2)} = \frac{Z^*_{u_n}}{Z_{u_n}}. \tag{21} \]
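The ratio in (21) is straightforward to compute. The sketch below is illustrative; the function name `blocks_estimator` and the toy data are assumptions, and the threshold is taken as given.

```python
def blocks_estimator(x, u, k_n):
    """Blocks estimator Z*/Z: blocks with an exceedance over total exceedances."""
    r_n = len(x) // k_n
    sample = x[:k_n * r_n]          # drop observations outside the blocks
    z = sum(1 for value in sample if value > u)
    z_star = sum(
        1 for j in range(k_n) if max(sample[j * r_n:(j + 1) * r_n]) > u
    )
    if z == 0:
        raise ValueError("no exceedances of the threshold")
    return z_star / z


# Toy sample: exceedances 5 and 6 share a block, 7 sits alone.
x = [0, 5, 6, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0]
theta_blocks = blocks_estimator(x, 4, 4)
print(theta_blocks)
```

The reciprocal of this ratio is the empirical mean number of exceedances per block containing at least one exceedance, the sample counterpart of (7).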

This method is called the blocks method and may be considered a simplified version of $\hat\theta_n^{(1)}$. Another popular method is the runs estimator, which may be seen as the estimator of the extremal index for the definitions introduced in O’Brien (1987) or in Hsing (1993),

\[ \theta_n = \frac{W_{u_n}}{Z_{u_n}}, \tag{22} \]

where $W_{u_n} = \sum_{i=1}^{n - r_n} I(X_i > u_n)\,(1 - I(X_{i+1} > u_n)) \cdots (1 - I(X_{i+r_n} > u_n))$.
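The runs estimator (22) counts the exceedances that are followed by $r_n$ consecutive non-exceedances. A minimal sketch, with the hypothetical function name `runs_estimator` and toy data chosen for illustration:

```python
def runs_estimator(x, u, r_n):
    """Runs estimator W/Z with run length r_n.

    W counts indices i (among the first n - r_n) with x[i] > u and no
    exceedance among the next r_n observations; Z counts all exceedances.
    """
    n = len(x)
    exceed = [value > u for value in x]
    w = sum(
        1
        for i in range(n - r_n)
        if exceed[i] and not any(exceed[i + 1:i + r_n + 1])
    )
    z = sum(exceed)
    if z == 0:
        raise ValueError("no exceedances of the threshold")
    return w / z


# Toy sample with clusters {5, 6}, {7} and {8, 9}; run length r_n = 2.
x = [0, 5, 6, 0, 0, 0, 7, 0, 0, 0, 8, 9, 0, 0, 0, 0]
theta_runs = runs_estimator(x, 4, 2)
print(theta_runs)
```

Here three of the five exceedances close a cluster, so the estimate is the number of clusters divided by the number of exceedances.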

Our definition of the extremal index yields an appealing estimator of $\theta$ that inherits the limiting properties of the Poisson processes. Let $u_n$ and $v_n$ be the sequences satisfying (9) and (13); then our estimator is

\[ \tilde\theta_n = \frac{N_{k_n}^{(v_n)}}{N_{k_n}^{(u_n)}}, \tag{23} \]

with $N_{k_n}^{(u_n)}$, $N_{k_n}^{(v_n)}$ the numbers of exceedances of $u_n$ and $v_n$ by the process $\eta_{k_n}$ on $(0, 1]$. Note that this estimator may be written as $\tilde\theta_n = Z^*_{v_n}/Z^*_{u_n}$, representing the corresponding thinnings defined by the sequence $k_n$ and the thresholds $u_n$ and $v_n$. The estimator, however, is not fully specified since these sequences are not determined. By condition (2), adequate estimators of the sequence $u_n$ are given by extreme order statistics. Let $\hat u_n$ be an estimator of $u_n$; then the sequence $v_n$ is estimated by the order statistic of the stationary sequence $\{X_i\}$ satisfying the empirical counterpart of (12), i.e.

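\[ \hat v_n = \max_{1 \le i \le n} \left\{ x_i,\ i = 1, \dots, n \ \Bigg|\ \frac{1}{Z^*_{\hat u_n}} \sum_{j=1}^{k_n} B_{r_n,j}^{(x_i)} = 1 \right\}, \tag{24} \]

with $B_{r_n,j}^{(\hat u_n)} = \sum_{k=(j-1)r_n+1}^{jr_n} I(X_k > \hat u_n)$. This expression boils down to $\hat v_n = x_{(n - Z^*_{\hat u_n})}$, with $x_{(1)} \le \dots \le x_{(n)}$ the sequence of order statistics. The extremal index is then estimated by

\[ \tilde\theta_n = \frac{Z^*_{\hat v_n}}{Z^*_{\hat u_n}}. \tag{25} \]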
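The two-step construction in (24) and (25) can be sketched as follows. This is a hedged illustration and not the author's Matlab code: the function name `thinning_estimator` is hypothetical, the threshold estimate `u_hat` is supplied by the user, only the first $k_n r_n$ observations are used, and the sample is assumed to have no ties near the upper order statistics.

```python
def thinning_estimator(x, u_hat, k_n):
    """Return (v_hat, theta) following the two-threshold construction.

    Step 1: Z*_u = number of blocks whose maximum exceeds u_hat.
    Step 2: v_hat = order statistic x_(n - Z*_u), so that (without ties)
            exactly Z*_u observations exceed v_hat.
    Step 3: theta = Z*_v / Z*_u, with Z*_v the number of blocks whose
            maximum exceeds v_hat.
    """
    r_n = len(x) // k_n
    sample = x[:k_n * r_n]          # drop observations outside the blocks
    n = len(sample)
    maxima = [max(sample[j * r_n:(j + 1) * r_n]) for j in range(k_n)]
    z_star_u = sum(1 for m in maxima if m > u_hat)
    if z_star_u == 0 or z_star_u >= n:
        raise ValueError("threshold unsuitable for this partition")
    ordered = sorted(sample)
    v_hat = ordered[n - z_star_u - 1]   # x_(n - Z*_u) in 1-based notation
    z_star_v = sum(1 for m in maxima if m > v_hat)
    return v_hat, z_star_v / z_star_u


# Toy sample: n = 16, k_n = 4, u_hat = 4; three blocks exceed u_hat.
x = [0, 5, 6, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 8, 9, 9.5]
v_hat, theta_tilde = thinning_estimator(x, 4, 4)
print(v_hat, theta_tilde)
```

In this toy sample the three observations above `v_hat` fall in a single block, so only one block maximum survives the second thinning.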

The usual estimators for $\theta$, as well as this method, rely on the Poisson character of the corresponding point processes; see conditions (9) and (11). By this property, the threshold sequence $u_n$ appearing in these expressions is estimated by an extreme order statistic (see Section 2.5 in Leadbetter, Lindgren and Rootzén (1983)). This property also implies that the variance of the number of exceedances of $u_n$ remains constant, which is a serious drawback for the consistency of these estimators.

To overcome this, Hsing (1991) considers an estimator of the form (21) where the sequence $u_n$ satisfies a condition of type (9) that includes a sequence $c_n \to \infty$, such that $n(1 - F(u_n))/c_n \to \lambda$. Note that in this situation the threshold sequence is defined by an increasing rank, the variance of the corresponding estimator of $\theta$ decreases as $n \to \infty$, and the central limit theorem applies. This condition results in a consistent estimator for $\theta$. On the other hand, Smith and Weissman (1994) consider a sequence $u_n$ where (9) holds with $\lambda \to \infty$ to obtain the consistency of the usual estimators of $\theta$.

Nevertheless, not every increasing rank yields consistent estimators.

Moreover, a wrong choice of $u_n$ can result in erroneous conclusions about the behavior of the usual estimators. For example, suppose the threshold sequence is estimated by a central rank, that is, $Z_{u_n}/n = \nu < 1$, with $\nu$ constant, and $Z_{u_n} \to \infty$ as $n \to \infty$. By definition, $k_n = o(n)$, which implies $k_n = o(Z_{u_n})$. This in turn yields $\hat\theta_n^{(1)}, \hat\theta_n^{(2)} \to 0$, because $Z_{u_n}/n$ is constant and $Z^*_{u_n} = o(Z_{u_n})$, respectively. Therefore, for these estimators to be consistent it is necessary that $Z_{u_n} \to \infty$, which amounts to $\lambda \to \infty$, but with the threshold sequence defined by an intermediate rank, $Z_{u_n} = o(n)$. In contrast, the estimator introduced in (25) leaves more freedom to select $u_n$; the only constraint is given by condition (18). This is because, although $Z^*_{u_n} = o(Z_{u_n})$ and $Z^*_{v_n} = o(Z_{u_n})$, the estimator is by construction of $\{v_n\}$ such that $Z^*_{v_n}/Z^*_{u_n} \to \theta$.

3.1 Statistical properties of the different estimators

The estimator $\tilde\theta_n$ may be regarded as the quotient of the numbers of elements of two point processes $N_{k_n}^{(v_n)}$ and $N_{k_n}^{(u_n)}$, where $v_n$ and $u_n$ satisfy (13) and (9) respectively,

\[ \tilde\theta_n = \frac{\sum_{j=1}^{k_n} I\left( M_{(j-1)r_n+1,\,jr_n} > \hat v_n \right)}{\sum_{j=1}^{k_n} I\left( M_{(j-1)r_n+1,\,jr_n} > \hat u_n \right)}. \]

Alternatively, this estimator may be expressed as the ratio of the numbers of blocks, defined by the sequences $\{k_n\}$, $\{r_n\}$, with at least one exceedance of $v_n$ and $u_n$ respectively,

\[ \tilde\theta_n = \frac{Z^*_{v_n}}{Z^*_{u_n}}. \]

By the second order Taylor expansion of $E[Z^*_{v_n}/Z^*_{u_n}]$ about the respective expected values (delta method) we have that

\[ E[\tilde\theta_n] = \frac{E[Z^*_{v_n}]}{E[Z^*_{u_n}]} \left( 1 + \frac{V[Z^*_{u_n}]}{E[Z^*_{u_n}]^2} - \frac{\mathrm{Cov}(Z^*_{v_n}, Z^*_{u_n})}{E[Z^*_{u_n}]\,E[Z^*_{v_n}]} \right). \tag{26} \]

For finite samples the dependence in the largest observations is taken into account, that is,

\[ E[Z^{*2}_{u_n}] = E[Z^*_{u_n}] + 2 \sum_{i=1}^{k_n} \sum_{j > i}^{k_n} P\{M_i > u_n, M_j > u_n\}, \]

where for ease of notation $M_i$ is the maximum over the block $\{X_{(i-1)r_n+1}, \dots, X_{ir_n}\}$. Under condition $D(u_n)$, the difference between $k_n^2 P\{M_i > u_n, M_j > u_n\}$ and $E^2[Z^*_{u_n}]$ converges to 0 as $n$ increases, and $k_n P\{M_i > u_n, M_j > u_n\}$ with $i \ne j$ is well approximated by $E[Z^*_{u_n}] P\{M_i > u_n\}$, which in turn also converges to 0. Therefore, for $n$ sufficiently high, expression (26) becomes

\[ E[\tilde\theta_n] = \frac{E[Z^*_{v_n}]}{E[Z^*_{u_n}]} \left( 1 + \frac{E[Z^*_{u_n}]}{E[Z^*_{u_n}]^2} - \frac{E[Z^*_{v_n}]}{E[Z^*_{u_n}]\,E[Z^*_{v_n}]} \right) + o\left(\frac{1}{\lambda}\right), \]

and it is immediate to see that the expected value of our estimator is

\[ E[\tilde\theta_n] = \theta + o\left(\frac{1}{\lambda}\right). \tag{27} \]
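The cancellation behind (27) can be made explicit. The lines below only simplify the bracket of the preceding expression and then apply (11) and (14) to the leading ratio; no additional assumption is introduced.

```latex
\begin{aligned}
1 + \frac{E[Z^*_{u_n}]}{E[Z^*_{u_n}]^2}
  - \frac{E[Z^*_{v_n}]}{E[Z^*_{u_n}]\,E[Z^*_{v_n}]}
  &= 1 + \frac{1}{E[Z^*_{u_n}]} - \frac{1}{E[Z^*_{u_n}]} = 1, \\
\frac{E[Z^*_{v_n}]}{E[Z^*_{u_n}]}
  &= \frac{k_n\,(1 - F_{1,\dots,r_n}(v_n))}{k_n\,(1 - F_{1,\dots,r_n}(u_n))}
  \;\longrightarrow\; \frac{\theta^2\lambda}{\theta\lambda} = \theta .
\end{aligned}
```

The variance and covariance corrections of the delta method therefore offset each other, which is why the leading bias term vanishes for this estimator.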

For the analysis of the variance it is useful to derive the conditional moments. Suppose $Z^*_{u_n}$ is known. Then, by (14), $E[\tilde\theta_n \mid Z^*_{u_n}] = \frac{\theta^2\lambda + o(\lambda)}{Z^*_{u_n}}$, and the conditional variance takes the form

\[ V[\tilde\theta_n \mid Z^*_{u_n}] = \frac{E[Z^*_{v_n}] + 2 \sum_{i=1}^{k_n} \sum_{j > i}^{k_n} P\{M_i > v_n, M_j > v_n\} - E[Z^*_{v_n}]^2}{Z^{*2}_{u_n}}. \tag{28} \]

By Leadbetter’s mixing condition, the conditional variance is

\[ V[\tilde\theta_n \mid Z^*_{u_n}] = \frac{\theta^2\lambda + o(\lambda)}{Z^{*2}_{u_n}}. \tag{29} \]

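The unconditional variance can be decomposed into two terms, $V[\tilde\theta_n] = V[E[\tilde\theta_n \mid Z^*_{u_n}]] + E[V[\tilde\theta_n \mid Z^*_{u_n}]]$ (law of total variance). By the second order Taylor expansion of $E[1/Z^*_{u_n}]$ and $E[1/Z^{*2}_{u_n}]$ about $E[Z^*_{u_n}]$ we obtain that

\[ V[E[\tilde\theta_n \mid Z^*_{u_n}]] = [\theta^2\lambda + o(\lambda)]^2 \left[ \frac{V(Z^*_{u_n})}{E(Z^*_{u_n})^4} + o\left(\frac{1}{\lambda^3}\right) \right] = \frac{\theta}{\lambda} + o\left(\frac{1}{\lambda}\right), \tag{30} \]

and

\[ E[V[\tilde\theta_n \mid Z^*_{u_n}]] = (\theta^2\lambda + o(\lambda)) \left[ \frac{1}{(\lambda' + o(\lambda))^2} + o\left(\frac{1}{\lambda^2}\right) \right] = O\left(\frac{1}{\lambda}\right). \tag{31} \]

In consequence,

\[ V[\tilde\theta_n] = \frac{\theta + 1}{\lambda} + o\left(\frac{1}{\lambda}\right) = O\left(\frac{1}{\lambda}\right). \tag{32} \]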
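The leading terms of (30) to (32) can be traced as follows. This is a sketch under two assumptions that are not stated explicitly in the text: the Poisson limit gives $V(Z^*_{u_n}) \approx E(Z^*_{u_n}) \approx \theta\lambda$, and $\lambda'$ in (31) is read as the limit $\theta\lambda$ of $E[Z^*_{u_n}]$ given by (11).

```latex
\begin{aligned}
V\big[E[\tilde\theta_n \mid Z^*_{u_n}]\big]
  &\approx (\theta^2\lambda)^2 \,\frac{\theta\lambda}{(\theta\lambda)^4}
   = \frac{\theta}{\lambda}, \\
E\big[V[\tilde\theta_n \mid Z^*_{u_n}]\big]
  &\approx \frac{\theta^2\lambda}{(\theta\lambda)^2}
   = \frac{1}{\lambda}, \\
V[\tilde\theta_n]
  &\approx \frac{\theta}{\lambda} + \frac{1}{\lambda}
   = \frac{\theta + 1}{\lambda}.
\end{aligned}
```

Both components are of order $1/\lambda$, so the variance does not vanish when $\lambda$ is held fixed.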

Therefore the mean square error (MSE) of our estimator is of order $O(1/\lambda)$. This result implies that this estimator is not consistent for $u_n$ and $v_n$ defined by extreme order statistics. Consistency is achieved when these sequences are replaced by $\tilde u_n$ and $\tilde v_n$ defined by intermediate order statistics, as shown in the following section.

Our estimator may be interpreted as a refinement of the standard blocks method $\hat\theta_n^{(2)}$ by writing $\tilde\theta_n = \frac{Z^*_{v_n}/Z_{u_n}}{\hat\theta_n^{(2)}}$. The asymptotic properties of the latter estimator $\hat\theta_n^{(2)}$ are derived in Hsing (1991) and in Smith and Weissman (1994). By means of the delta method they find that $E[\hat\theta_n^{(2)}] = \theta + O(1/\lambda)$ and that the variance is $V[\hat\theta_n^{(2)}] = O(1/\lambda)$. Note that the bias of this estimator is of higher order than the bias of $\tilde\theta_n$. However, the mean square error (MSE) of both estimators is of order $O(1/\lambda)$.

For the logs method,

\[ E[\hat\theta_n^{(1)}] = \frac{E[Z^*_{u_n}]}{E[Z_{u_n}]} \left( 1 + \frac{E[Z^*_{u_n}]}{2k_n} + \frac{E^2[Z^*_{u_n}]}{6k_n^2} \right) = \theta + O\left(\frac{\lambda}{k_n}\right). \]

Note that the choice of the sequence $k_n$ is of paramount importance in determining the bias and the mean square error of this estimator.


3.2 Inference for the Extremal Index

Consider now the sequences $\tilde u_n$ and $\tilde v_n$ defined by the conditions $\lambda'_n = k_n(1 - F_{1,\dots,r_n}(\tilde u_n))$, $\lambda''_n = k_n(1 - F_{1,\dots,r_n}(\tilde v_n))$, with $\lambda'_n \to \infty$, $\lambda''_n \to \infty$ and $\lambda''_n/\lambda'_n \to \theta$. In this case the first two moments of $Z^*_{\tilde u_n}$ and $Z^*_{\tilde v_n}$ do not converge to a constant but diverge to infinity.

The variance is given by

\[ V[Z^*_{\tilde u_n}] = E[Z^*_{\tilde u_n}] + \left( \sum_{i=1}^{k_n} \sum_{j=1}^{k_n} P\{M_i > \tilde u_n, M_j > \tilde u_n\} - E^2[Z^*_{\tilde u_n}] \right) - k_n P\{M_i > \tilde u_n, M_j > \tilde u_n\}, \]

for some $i \ne j$. Note that in this case, under $D(u_n)$, for $n$ sufficiently high the variance is

\[ V[Z^*_{\tilde u_n}] = E[Z^*_{\tilde u_n}] - E[Z^*_{\tilde u_n}]\, P\{M_j > \tilde u_n\}. \]

The covariance in turn takes the form

\[ \mathrm{Cov}[Z^*_{\tilde u_n}, Z^*_{\tilde v_n}] = E[Z^*_{\tilde v_n}] - E[Z^*_{\tilde v_n}]\, P\{M_j > \tilde u_n\}. \]

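Therefore expression (26) becomes

\[ E[\tilde\theta_n] = \frac{\lambda''_n}{\lambda'_n} \left( 1 + \frac{\lambda'_n P\{M_j \le \tilde u_n\}}{(\lambda'_n)^2} - \frac{\lambda''_n P\{M_j \le \tilde u_n\}}{\lambda'_n \lambda''_n} \right) + o\left(\frac{1}{\lambda_n}\right), \tag{33} \]

which boils down to $E[\tilde\theta_n] = \theta + o\left(\frac{1}{\lambda_n}\right)$, by the definition of $\lambda'_n$ and $\lambda''_n$.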

This estimator of $\theta$ is asymptotically unbiased, and therefore outperforms $\hat\theta_n^{(2)}$ in this sense. Furthermore, if $\lambda_n^2 > k_n$ the bias of $\tilde\theta_n$ is of smaller order than the bias of the estimator of the logs method.

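In this setting the distributions of $Z^*_{\tilde u_n}$ and $Z^*_{\tilde v_n}$ are well approximated by the normal distribution for $n$ sufficiently high. In particular,

\[ Z^*_{\tilde u_n} \overset{w}{\sim} N\left( \lambda'_n,\ \lambda'_n F_{1,\dots,r_n}(\tilde u_n) \right). \tag{34} \]

In the same way,

\[ Z^*_{\tilde v_n} \overset{w}{\sim} N\left( \lambda''_n,\ \lambda''_n F_{1,\dots,r_n}(\tilde v_n) \right). \tag{35} \]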

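On the other hand, the sequences $\tilde u_n$ and $\tilde v_n$ are related by the expression

\[ 1 - F_{1,\dots,r_n}(\tilde v_n) = \left[ 1 - F_{1,\dots,r_n}(\tilde u_n) \right] \left[ 1 - \frac{F_{1,\dots,r_n}(\tilde v_n) - F_{1,\dots,r_n}(\tilde u_n)}{1 - F_{1,\dots,r_n}(\tilde u_n)} \right]. \tag{36} \]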

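Define the random variable $Z^*_{\tilde v_n} \mid Z^*_{\tilde u_n}$, that is, $\lambda'_n$ is given. Its expected value is

\[ E[Z^*_{\tilde v_n} \mid Z^*_{\tilde u_n}] = \lambda'_n \left[ 1 - \frac{F_{1,\dots,r_n}(\tilde v_n) - F_{1,\dots,r_n}(\tilde u_n)}{1 - F_{1,\dots,r_n}(\tilde u_n)} \right], \]

which by (36) is $E[Z^*_{\tilde v_n} \mid Z^*_{\tilde u_n}] = \lambda''_n$.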