
Consider a stationary time series $(X_t)$ with marginal distribution $F$, for which we want to derive probabilities of events that are more extreme than the observations recorded so far. The univariate framework of Section 2.1.1 is directly applicable to this problem if we can reasonably assume the $X_t$ to be independent. In practice, this is rarely the case, and departures from independence have been explored in the literature.

Leadbetter (1974) showed that a weak mixing condition suffices for Theorem 2.1 to hold for the stationary sequence (Xt).

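Definition 2.2 ($D(u_n)$ condition)

The stationary sequence $(X_t)$ satisfies the $D(u_n)$ condition for a sequence $(u_n)$ if, for each $n$, $l$, and each choice of integers $1 \le i_1 < \cdots < i_p$ and $j_1 < \cdots < j_q \le n$ with $j_1 - i_p \ge l$, we have

$$
\left| F_{i_1,\dots,i_p,j_1,\dots,j_q}(u_n,\dots,u_n) - F_{i_1,\dots,i_p}(u_n,\dots,u_n)\, F_{j_1,\dots,j_q}(u_n,\dots,u_n) \right| \le \varepsilon_{n,l},
$$

where $F_{i_1,\dots,i_p}$ denotes the joint distribution function of $(X_{i_1},\dots,X_{i_p})$, with $\varepsilon_{n,l} \to 0$ as $n \to \infty$ and $l = l_n = o(n)$.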

The $D(u_n)$ condition ensures that long-range dependence remains small, but does not prevent clustering of extreme values. The independence assumption in Theorem 2.1 can be replaced by the much weaker assumption that the $X_t$ satisfy the $D(u_n)$ condition for all sequences $(u_n)$ with $u_n = b_n x + a_n$.

Consider the series $(X_t^*)$ of independent variables with the same marginal distribution $F$, and denote $M_n^* = \max\{X_1^*,\dots,X_n^*\}$. A strong link connects the limiting distribution of $M_n^*$ with that of $M_n$ under the $D(u_n)$ condition (Leadbetter, 1983).

Theorem 2.2 (Extremal index)

Let the $D(u_n)$ condition hold for the stationary sequence $(X_t)$. Then there exist sequences $(a_n^*)$, $(b_n^*) > 0$, and a non-degenerate distribution function $G^*$ such that

$$
\Pr\left( \frac{M_n^* - a_n^*}{b_n^*} \le x \right) \to G^*(x), \quad n \to \infty,
$$

if and only if there exist sequences $(a_n)$, $(b_n) > 0$, and a non-degenerate distribution function $G$ such that

$$
\Pr\left( \frac{M_n - a_n}{b_n} \le x \right) \to G(x), \quad n \to \infty.
$$

In addition, we have $a_n^* = a_n$, $b_n^* = b_n$ and $G(x) = \{G^*(x)\}^{\theta}$, with $\theta \in [0,1]$ named the extremal index.

As this result shows, the extremal index plays a key role when modelling time series with short-range dependence, and it will be further detailed in Section 2.3. Independent time series have $\theta = 1$, but this value also arises in time series with some dependence, under a condition on the short-range dependence of $(X_t)$ (Leadbetter, 1974; Leadbetter et al., 1983, Chap. 3).
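The role of $\theta$ can be illustrated numerically. The sketch below is not taken from the source: it assumes a moving-maximum process $X_t = \max(Z_t, Z_{t+1})/2$ with independent unit Fréchet $Z_t$, a standard textbook example with unit Fréchet margins and extremal index $1/2$. The function name, sample sizes and seed are illustrative choices.

```python
import numpy as np

def simulate_moving_max(n, n_rep, rng):
    """Simulate n_rep series of length n from X_t = max(Z_t, Z_{t+1}) / 2."""
    # Unit Frechet variables by inverse transform: Z = -1 / log(U).
    z = -1.0 / np.log(rng.uniform(size=(n_rep, n + 1)))
    # Each X_t has a unit Frechet margin, but neighbours share one Z.
    return np.maximum(z[:, :-1], z[:, 1:]) / 2.0

rng = np.random.default_rng(1)
n, n_rep = 200, 10000
maxima = simulate_moving_max(n, n_rep, rng).max(axis=1)

level = float(n)                          # threshold of the form u_n = n x, x = 1
p_dependent = np.mean(maxima <= level)    # Monte Carlo estimate of Pr(M_n <= u_n)
p_independent = np.exp(-n / level)        # exact Pr(M_n* <= u_n) for unit Frechet
# Since Pr(M_n <= u_n) is close to Pr(M_n* <= u_n)^theta, take a ratio of logs.
theta_hat = np.log(p_dependent) / np.log(p_independent)
print(f"estimated extremal index: {theta_hat:.3f}")
```

Under these assumptions the estimate should be close to $0.5$: the maximum of the dependent series behaves like that of an independent series of about half the length.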

Definition 2.3 ($D'(u_n)$ condition)

The stationary sequence $(X_t)$ satisfies the $D'(u_n)$ condition for a sequence $(u_n)$ if

$$
\lim_{k \to \infty} \limsup_{n \to \infty} \left\{ n \sum_{t=2}^{\lfloor n/k \rfloor} \Pr(X_1 > u_n, X_t > u_n) \right\} = 0.
$$

While the $D(u_n)$ condition ensures that long-range dependence decreases at an appropriate rate, the $D'(u_n)$ condition ensures that short-range dependence remains sufficiently low, by imposing that two observations have a small probability of exceeding $u_n$ in any block of length $n$.
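As a quick check of Definition 2.3, consider the independent case, under the additional assumption (not part of the definition) that $1 - F(u_n) \sim \tau/n$ for some $\tau > 0$. Joint exceedance probabilities then factorise, and

$$
n \sum_{t=2}^{\lfloor n/k \rfloor} \Pr(X_1 > u_n, X_t > u_n)
= n \left( \lfloor n/k \rfloor - 1 \right) \{1 - F(u_n)\}^2
\to \frac{\tau^2}{k}, \quad n \to \infty,
$$

which tends to $0$ as $k \to \infty$, so an independent sequence satisfies the condition for such $(u_n)$.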

In practice, the $D(u_n)$ and $D'(u_n)$ conditions are hard to verify, but they support the approach of making inference on time series even when weak dependence is observed at finite levels. A typical approach is maximum likelihood estimation based on the conditional distribution (2.6). Other methods for estimating the generalised Pareto distribution in this context are reviewed in Kotz and Nadarajah (2000, Chap. 1). When short-range dependence in $(X_t)$ cannot be ignored, inference typically involves a pre-processing step that retains only observations that can be considered independent (Davison and Smith, 1990), or requires inflation of standard errors for estimates derived from all exceedances of a high threshold (Fawcett and Walshaw, 2007, 2012).

The point process theory related to clustered stationary time series $(X_t)$ (Hsing et al., 1988; Leadbetter, 1995) describes the limit process of such series as a compound Poisson process (Daley and Vere-Jones, 2003, Chap. 2), where the marks are the cluster sizes. Assuming $1 - F(u_n) \sim \tau/n$, $n \to \infty$, $\tau > 0$ and asymptotic cluster size distribution $\pi$, this compound Poisson process has intensity $\theta\tau$ and mark size distribution $\pi$. This theory also establishes that the distribution of cluster maxima is asymptotically identical to the marginal distribution of all excesses (Pickands, 1975). Several declustering schemes exist, of which the most popular are based on blocks (Leadbetter et al., 1989) and on runs (Smith, 1989). The former partitions the series into $n_B$ blocks of the same length $r_B$ and picks the blocks with at least one exceedance of $u_n$ as clusters; the latter picks clusters of exceedances which are separated by at least $r_R$ non-exceedances of $u_n$. Having defined clusters, peaks over threshold analysis is based on the maximum value in each of them. Both methods involve the choice of an arbitrary quantity $r_B$ or $r_R$, as well as the threshold $u_n$, which can affect subsequent inferences. An automatic selection procedure is suggested by Ferro and Segers (2003) for a given $u_n$.
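The two declustering schemes can be sketched in a few lines. This is a minimal illustration, not the implementation of any cited paper: the function names, the toy series, and the choice to drop a trailing incomplete block are assumptions made here.

```python
import numpy as np

def runs_decluster(x, u, r):
    """Cluster maxima, with clusters separated by at least r non-exceedances of u."""
    x = np.asarray(x, dtype=float)
    exceed_idx = np.flatnonzero(x > u)
    if exceed_idx.size == 0:
        return np.array([])
    # Number of non-exceedances between consecutive exceedances.
    gaps = np.diff(exceed_idx) - 1
    # A new cluster starts after every gap of at least r non-exceedances.
    cut_points = np.flatnonzero(gaps >= r) + 1
    clusters = np.split(exceed_idx, cut_points)
    return np.array([x[idx].max() for idx in clusters])

def blocks_decluster(x, u, block_length):
    """Maxima of the blocks of given length that contain at least one exceedance of u."""
    x = np.asarray(x, dtype=float)
    n_blocks = x.size // block_length  # assumption: trailing incomplete block is dropped
    block_max = x[: n_blocks * block_length].reshape(n_blocks, block_length).max(axis=1)
    return block_max[block_max > u]

# Toy series (hypothetical values) with threshold u = 4.
x = [1, 5, 6, 1, 1, 7, 1, 1, 1, 8, 2, 9]
runs_maxima = runs_decluster(x, u=4, r=2)               # array([6., 7., 9.])
block_maxima = blocks_decluster(x, u=4, block_length=6)  # array([7., 9.])
# Ratio of clusters to exceedances, a common simple estimate of the extremal index.
theta_hat = runs_maxima.size / np.sum(np.asarray(x) > 4)
```

On this toy series the runs method with $r_R = 2$ finds three clusters while blocks of length $r_B = 6$ find two, which shows how the arbitrary choice of $r_B$ or $r_R$ changes the set of cluster maxima passed to the peaks over threshold analysis.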

Fawcett and Walshaw (2007, 2012) showed, through extensive simulations and examples, that the pre-processing of dependent time series may yield badly-biased estimates of tail probabilities and return levels, which we shall define shortly. The authors suggest using all exceedances of a threshold and computing uncertainty by inflating standard errors with a sandwich method. Eastoe and Tawn (2012) take another approach, based on the idea that the distributions of cluster maxima and of marginal exceedances coincide only in the limit. Their model is a modified generalised Pareto distribution that better reflects the distribution of cluster maxima at subasymptotic levels, with the additional benefit of using the information contained in all exceedances of a threshold; this will be further developed in Chapter 5.

An important aspect in the field of extreme values is the communication of conclusions, for example about an estimate of $\Pr(X \in B)$, which may be a very small quantity, in a way that can be grasped by common sense and can serve as a basis for decisions. Return periods are defined such that risk assessment is expressed in terms of a time span instead of tiny probabilities. For a stationary series, we say that an extreme event of size $x_n$ has a return period of $n$ years if the probability of experiencing an event of size exceeding $x_n$ in a year is $1/n$. The event size $x_n$ is called the $n$-year return level.
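As a worked illustration of this definition, the sketch below computes a return level under the assumption that the annual maximum follows a generalised extreme value distribution with location `mu`, scale `sigma` and non-zero shape `xi`. The parameter values are hypothetical and the function names are chosen here for illustration.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function for xi != 0, valid where 1 + xi (x - mu) / sigma > 0."""
    return math.exp(-((1.0 + xi * (x - mu) / sigma) ** (-1.0 / xi)))

def gev_return_level(n_years, mu, sigma, xi):
    """Level exceeded by the annual maximum with probability 1 / n_years (xi != 0)."""
    y = -math.log(1.0 - 1.0 / n_years)
    # Invert the GEV distribution function at probability 1 - 1 / n_years.
    return mu + sigma / xi * (y ** (-xi) - 1.0)

mu, sigma, xi = 30.0, 5.0, 0.1  # hypothetical parameter values
x100 = gev_return_level(100, mu, sigma, xi)
# By construction, the annual exceedance probability of x100 is 1/100.
annual_exceedance_prob = 1.0 - gev_cdf(x100, mu, sigma, xi)
```

Reporting `x100` as "the level exceeded on average once every 100 years" conveys the same information as the exceedance probability $0.01$, in the form the text describes.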

All the methods described so far in this section deal with the marginal distribution of $X_t$ but do not consider modelling the joint distribution in time. This is of particular interest when deriving functionals of extreme events, such as the duration of extreme events, e.g., the duration of drought (Winter and Tawn, 2016), the cumulated intensity of an event, e.g., the amount of rain over a period of extreme rainfall, the $r$th-largest statistic of a cluster, and many others (Yun, 2000; Segers, 2003).

A natural approach to fitting the joint distribution of a series is to assume a Markov property. Smith et al. (1997) were the first to consider this type of modelling for extremes, with a likelihood of the form

$$
L(x_1,\dots,x_n;\phi_1,\phi_2) = f(x_1;\phi_1) \prod_{t=2}^{n} f(x_t \mid x_{t-1};\phi_1,\phi_2),
$$

where $\phi_1$ and $\phi_2$ denote the parameters of the marginal and joint distributions respectively, $f(\cdot)$ is the marginal density and $f(\cdot \mid x)$ is the conditional density given $X = x$. They use the alternative formulation

$$
\frac{\prod_{t=2}^{n} f(x_t, x_{t-1};\phi_1,\phi_2)}{\prod_{t=2}^{n-1} f(x_t;\phi_1)},
$$

