Fisher–Tippett–Gnedenko theorem

In statistics, the Fisher–Tippett–Gnedenko theorem (also the Fisher–Tippett theorem or the extreme value theorem) is a general result in extreme value theory regarding the asymptotic distribution of extreme order statistics. The maximum of a sample of iid random variables, after proper renormalization, can converge in distribution only to one of three possible distribution families: the Gumbel distribution, the Fréchet distribution, or the Weibull distribution. Credit for the extreme value theorem and its convergence details is given to Fréchet (1927),[1] Fisher and Tippett (1928),[2] von Mises (1936),[3][4] and Gnedenko (1943).[5] The role of the extremal types theorem for maxima is similar to that of the central limit theorem for averages, except that the central limit theorem applies to the average of a sample from any distribution with finite variance, while the Fisher–Tippett–Gnedenko theorem only states that if the distribution of a normalized maximum converges, then the limit has to be one of a particular class of distributions. It does not state that the distribution of the normalized maximum does converge.

Statement

Let $X_1, X_2, \ldots, X_n$ be an $n$-sized sample of independent and identically distributed random variables, each of whose cumulative distribution function is $F$. Suppose that there exist two sequences of real numbers $a_n > 0$ and $b_n$ such that the following limit converges to a non-degenerate distribution function:

$$\lim_{n\to\infty} P\left(\frac{\max\{X_1, \ldots, X_n\} - b_n}{a_n} \le x\right) = G(x),$$

or equivalently:

$$\lim_{n\to\infty} \bigl(F(a_n x + b_n)\bigr)^n = G(x).$$
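The equivalence of the two limits rests on independence and on $a_n > 0$. A short sketch of that step, using only the symbols defined above:

```latex
\begin{align*}
P\!\left(\frac{\max\{X_1,\dots,X_n\}-b_n}{a_n}\le x\right)
 &= P\bigl(X_1\le a_n x+b_n,\ \dots,\ X_n\le a_n x+b_n\bigr) \\
 &= \prod_{i=1}^{n} P\bigl(X_i\le a_n x+b_n\bigr)
  = \bigl(F(a_n x+b_n)\bigr)^n .
\end{align*}
```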

In such circumstances, the limiting function G is the cumulative distribution function of a distribution belonging to either the Gumbel, the Fréchet, or the Weibull distribution family.[6] In other words, if the limit above converges, then up to a linear change of coordinates G(x) will assume either the form:[7]

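$$G_\gamma(x) = \exp\left(-(1 + \gamma x)^{-1/\gamma}\right) \quad \text{for } \gamma \neq 0,$$

with the non-zero parameter $\gamma$ also satisfying $1 + \gamma x > 0$ for every $x$ value supported by $F$ (for all values $x$ for which $F(x) \neq 0$). Otherwise it has the form:

$$G_0(x) = \exp\left(-\exp(-x)\right) \quad \text{for } \gamma = 0.$$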

This is the cumulative distribution function of the generalized extreme value distribution (GEV) with extreme value index γ. The GEV distribution groups the Gumbel, Fréchet, and Weibull distributions into a single composite form.
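As an illustration of how the two forms fit together, the following minimal sketch evaluates the standardized GEV distribution function for any extreme value index. The function name `gev_cdf` is hypothetical, and the values returned outside the support (0 to the left, 1 to the right) are the usual convention for a distribution function rather than something stated above.

```python
import math

def gev_cdf(x: float, gamma: float) -> float:
    """CDF of the standardized GEV distribution with extreme value index gamma."""
    if gamma == 0.0:
        t = -x  # Gumbel case: G_0(x) = exp(-exp(-x))
    else:
        if 1.0 + gamma * x <= 0.0:
            # Outside the support: left of it when gamma > 0, right of it when gamma < 0.
            return 0.0 if gamma > 0.0 else 1.0
        # (1 + gamma*x)^(-1/gamma) = exp(t), and t tends to -x as gamma tends to 0.
        t = -math.log1p(gamma * x) / gamma
    # Cap t so that exp(t) cannot overflow; exp(-exp(700)) already evaluates to 0.0.
    return math.exp(-math.exp(min(t, 700.0)))

# The values for small gamma approach the Gumbel value at gamma = 0.
for gamma in (1.0, 0.1, 0.001, 0.0, -0.5):
    print(f"gamma={gamma:>6}: G(1.0) = {gev_cdf(1.0, gamma):.6f}")
```

Writing the exponent through `log1p` keeps the computation stable for very small non-zero gamma, which is where the Fréchet and Weibull forms merge into the Gumbel form.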

Conditions of convergence

The Fisher–Tippett–Gnedenko theorem is a statement about the convergence of the limiting distribution G(x), above. The study of conditions for convergence of G to particular cases of the generalized extreme value distribution began with Mises (1936)[3][5][4] and was further developed by Gnedenko (1943).[5]

Let $F$ be the distribution function of $X$, and $X_1, \ldots, X_n$ be some i.i.d. sample thereof.
Also let $x_{\max}$ be the population maximum: $x_{\max} \equiv \sup\{x \mid F(x) < 1\}$.

The limiting distribution of the normalized sample maximum, given by G above, will then be:[7]

Fréchet distribution ($\gamma > 0$)
For strictly positive $\gamma > 0$, the limiting distribution converges if and only if
$$x_{\max} = \infty$$
and
$$\lim_{t\to\infty} \frac{1 - F(ut)}{1 - F(t)} = u^{-1/\gamma} \quad \text{for all } u > 0.$$
In this case, possible sequences that will satisfy the theorem conditions are
$$b_n = 0$$
and
$$a_n = F^{-1}\left(1 - \frac{1}{n}\right).$$
Strictly positive $\gamma$ corresponds to what is called a heavy-tailed distribution.
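Gumbel distribution ($\gamma = 0$)
For trivial $\gamma = 0$, and with $x_{\max}$ either finite or infinite, the limiting distribution converges if and only if
$$\lim_{t\to x_{\max}} \frac{1 - F\bigl(t + u\,\tilde{g}(t)\bigr)}{1 - F(t)} = e^{-u} \quad \text{for all } u > 0$$
with
$$\tilde{g}(t) \equiv \frac{\int_t^{x_{\max}} \bigl(1 - F(s)\bigr)\,ds}{1 - F(t)}.$$
Possible sequences here are
$$b_n = F^{-1}\left(1 - \frac{1}{n}\right)$$
and
$$a_n = \tilde{g}\left(F^{-1}\left(1 - \frac{1}{n}\right)\right).$$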
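Weibull distribution ($\gamma < 0$)
For strictly negative $\gamma < 0$, the limiting distribution converges if and only if
$$x_{\max} < \infty \quad \text{(is finite)}$$
and
$$\lim_{t\to 0^+} \frac{1 - F(x_{\max} - ut)}{1 - F(x_{\max} - t)} = u^{-1/\gamma} \quad \text{for all } u > 0.$$
Note that for this case the exponent $-1/\gamma$ is strictly positive, since $\gamma$ is strictly negative.
Possible sequences here are
$$b_n = x_{\max}$$
and
$$a_n = x_{\max} - F^{-1}\left(1 - \frac{1}{n}\right).$$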

Note that the second formula (the Gumbel distribution) is the limit of the first (the Fréchet distribution) as γ goes to zero.
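The normalizing sequences listed above can be checked numerically. The sketch below uses two distributions chosen purely for illustration (they are assumptions, not taken from the text): the unit exponential, for which $F^{-1}(1 - 1/n) = \ln n$ and $\tilde{g}(t) = 1$, and a Pareto distribution with tail index $\alpha = 2$ (so $\gamma = 1/2$), for which $F^{-1}(1 - 1/n) = n^{1/\alpha}$. All function names are hypothetical.

```python
import math

def exp_cdf(x: float) -> float:
    """Unit exponential CDF (illustrative Gumbel-domain example)."""
    return 1.0 - math.exp(-x) if x > 0.0 else 0.0

def pareto_cdf(x: float, alpha: float = 2.0) -> float:
    """Pareto CDF 1 - x^(-alpha) on x > 1 (illustrative Frechet-domain example)."""
    return 1.0 - x ** (-alpha) if x > 1.0 else 0.0

def normalized_max_cdf(cdf, a_n: float, b_n: float, n: int, x: float) -> float:
    """CDF of (max - b_n) / a_n at x, which equals F(a_n * x + b_n) ** n."""
    return cdf(a_n * x + b_n) ** n

n = 10**6

# Gumbel case: b_n = F^{-1}(1 - 1/n) = ln n, a_n = g~(b_n) = 1.
gumbel_gap = abs(
    normalized_max_cdf(exp_cdf, 1.0, math.log(n), n, 0.5)
    - math.exp(-math.exp(-0.5))
)

# Frechet case: b_n = 0, a_n = F^{-1}(1 - 1/n) = n ** (1 / alpha).
# The limit exp(-x^(-alpha)) is G_gamma after the linear change x = 1 + gamma * y.
alpha = 2.0
frechet_gap = abs(
    normalized_max_cdf(pareto_cdf, n ** (1.0 / alpha), 0.0, n, 1.5)
    - math.exp(-(1.5 ** (-alpha)))
)

print(f"Gumbel gap:  {gumbel_gap:.2e}")
print(f"Frechet gap: {frechet_gap:.2e}")
```

Both gaps shrink roughly like $1/n$ for these two examples; other distributions in the same domains of attraction can converge far more slowly.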

Examples

Fréchet distribution

The Cauchy distribution's density function is:

$$f(x) = \frac{1}{\pi^2 + x^2},$$

and its cumulative distribution function is:

$$F(x) = \frac{1}{2} + \frac{1}{\pi} \arctan\left(\frac{x}{\pi}\right).$$

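A little bit of calculus shows that the right tail's cumulative distribution $1 - F(x)$ is asymptotic to $1/x$, or

$$\ln F(x) \sim -\frac{1}{x} \quad \text{as } x \to \infty,$$

so we have

$$\ln\bigl(F(x)^n\bigr) = n \ln F(x) \sim -\frac{n}{x}.$$

Thus we have

$$F(x)^n \approx \exp\left(-\frac{n}{x}\right)$$

and letting $u \equiv \frac{x}{n} - 1$ (and skipping some explanation)

$$\lim_{n\to\infty} \bigl(F(nu + n)^n\bigr) = \exp\left(-\frac{1}{1 + u}\right) = G_1(u)$$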

for any u.
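The limit can be checked numerically. The sketch below evaluates $F(nu + n)^n$ for the Cauchy distribution function given above and compares it with $G_1(u)$; the helper names are hypothetical.

```python
import math

def cauchy_cdf(x: float) -> float:
    """CDF of the Cauchy distribution with scale pi, as in the example."""
    return 0.5 + math.atan(x / math.pi) / math.pi

def frechet_limit(u: float) -> float:
    """G_1(u) = exp(-1 / (1 + u)), valid for u > -1."""
    return math.exp(-1.0 / (1.0 + u))

def cauchy_gap(n: int, u: float) -> float:
    """Absolute difference between F(n*u + n)**n and its limit G_1(u)."""
    return abs(cauchy_cdf(n * u + n) ** n - frechet_limit(u))

for n in (10, 1000, 100000):
    print(f"n={n:>6}: gap at u=1 is {cauchy_gap(n, 1.0):.2e}")
```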

Gumbel distribution

Let us take the normal distribution with cumulative distribution function

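$$F(x) = \frac{1}{2} \operatorname{erfc}\left(\frac{-x}{\sqrt{2}}\right).$$

We have

$$\ln F(x) \sim -\frac{\exp\left(-\tfrac{1}{2} x^2\right)}{\sqrt{2\pi}\, x} \quad \text{as } x \to \infty$$

and thus

$$\ln\bigl(F(x)^n\bigr) = n \ln F(x) \sim -\frac{n \exp\left(-\tfrac{1}{2} x^2\right)}{\sqrt{2\pi}\, x} \quad \text{as } x \to \infty.$$

Hence we have

$$F(x)^n \approx \exp\left(-\frac{n \exp\left(-\tfrac{1}{2} x^2\right)}{\sqrt{2\pi}\, x}\right).$$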

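If we define $c_n$ as the value that exactly satisfies

$$\frac{n \exp\left(-\tfrac{1}{2} c_n^2\right)}{\sqrt{2\pi}\, c_n} = 1,$$

then around $x = c_n$

$$\frac{n \exp\left(-\tfrac{1}{2} x^2\right)}{\sqrt{2\pi}\, x} \approx \exp\bigl(c_n (c_n - x)\bigr).$$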
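This approximation is a first-order expansion around $x = c_n$. A sketch of the step, writing $\delta = x - c_n$ (a symbol introduced here only for convenience):

```latex
\begin{align*}
\ln\frac{n\exp\left(-\tfrac12 x^2\right)}{\sqrt{2\pi}\,x}
 &= \underbrace{\ln\frac{n\exp\left(-\tfrac12 c_n^2\right)}{\sqrt{2\pi}\,c_n}}_{=\,0}
    - \frac{x^2-c_n^2}{2} - \ln\frac{x}{c_n} \\
 &= -c_n\delta - \frac{\delta^2}{2} - \ln\!\left(1+\frac{\delta}{c_n}\right)
 \approx -c_n\delta = c_n(c_n - x).
\end{align*}
```

With the product $c_n \delta$ held fixed, both neglected terms vanish as $c_n \to \infty$.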

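As $n$ increases, this becomes a good approximation for a wider and wider range of $c_n (c_n - x)$, so letting $u \equiv c_n (x - c_n)$ we find that

$$\lim_{n\to\infty} \left(F\left(\frac{u}{c_n} + c_n\right)^n\right) = \exp\bigl(-\exp(-u)\bigr) = G_0(u).$$

Equivalently,

$$\lim_{n\to\infty} P\left(\frac{\max\{X_1, \ldots, X_n\} - c_n}{1/c_n} \le u\right) = \exp\bigl(-\exp(-u)\bigr) = G_0(u).$$

With this result, we see retrospectively that we need $\ln c_n \approx \frac{\ln \ln n}{2}$ and then

$$c_n \approx \sqrt{2 \ln n},$$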

so the maximum is expected to climb toward infinity ever more slowly.
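To see how slowly this convergence sets in, the following sketch solves the defining equation for $c_n$ by bisection and compares $F(u/c_n + c_n)^n$ with $G_0(u)$. The function names and the choice of bisection are illustrative assumptions, not part of the derivation above.

```python
import math

def solve_c_n(n: int) -> float:
    """Solve n * exp(-c**2 / 2) / (sqrt(2*pi) * c) = 1 for c > 0 by bisection."""
    def log_lhs(c: float) -> float:
        # Logarithm of the left-hand side; strictly decreasing in c.
        return (math.log(n) - 0.5 * c * c
                - 0.5 * math.log(2.0 * math.pi) - math.log(c))
    lo, hi = 1e-6, math.sqrt(2.0 * math.log(n)) + 1.0  # log_lhs(lo) > 0 > log_lhs(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if log_lhs(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normal_max_cdf(n: int, u: float) -> float:
    """F(u / c_n + c_n) ** n for the standard normal F, computed via the tail."""
    c = solve_c_n(n)
    tail = 0.5 * math.erfc((u / c + c) / math.sqrt(2.0))  # 1 - F(x)
    return math.exp(n * math.log1p(-tail))

def gumbel_cdf(u: float) -> float:
    return math.exp(-math.exp(-u))

for n in (100, 10**6, 10**12):
    gap = abs(normal_max_cdf(n, 0.0) - gumbel_cdf(0.0))
    print(f"n={n:.0e}: c_n={solve_c_n(n):.4f}, "
          f"sqrt(2 ln n)={math.sqrt(2 * math.log(n)):.4f}, gap at u=0: {gap:.4f}")
```

The gap at $u = 0$ decreases only slowly with $n$, which reflects the slow convergence of normal maxima to the Gumbel limit.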

Weibull distribution

We may take the simplest example, a uniform distribution between 0 and 1, with cumulative distribution function

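$$F(x) = x \quad \text{for any } x \text{ value from 0 to 1.}$$

For values of $x \to 1$ we have

$$\ln\bigl(F(x)^n\bigr) = n \ln F(x) \sim -n(1 - x).$$

So for $x \approx 1$ we have

$$F(x)^n \approx \exp(nx - n).$$

Let $u \equiv 1 - n(1 - x)$, so that $x = \frac{u}{n} + 1 - \frac{1}{n}$, and get

$$\lim_{n\to\infty} \left(F\left(\frac{u}{n} + 1 - \frac{1}{n}\right)\right)^n = \exp\bigl(-(1 - u)\bigr) = G_{-1}(u).$$

Close examination of that limit shows that the expected maximum approaches 1 in inverse proportion to $n$.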
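A small Monte Carlo experiment can illustrate both the limit and the rate at which the maximum approaches 1. The sketch below is only a sanity check under assumed settings (sample size 200, 5000 trials, fixed seed); the function name is hypothetical. It relies on the standard fact that the mean of the maximum of $n$ uniform variables is $n/(n+1)$, so the gap to 1 is $1/(n+1)$.

```python
import math
import random

def simulate_uniform_maxima(n: int, trials: int, seed: int = 0) -> list:
    """Draw `trials` independent maxima of n uniform(0, 1) variables."""
    rng = random.Random(seed)
    return [max(rng.random() for _ in range(n)) for _ in range(trials)]

n, trials = 200, 5000
maxima = simulate_uniform_maxima(n, trials)

# Empirical P(max <= u/n + 1 - 1/n) against the limit exp(-(1 - u)) at u = 0.
u = 0.0
x = u / n + 1.0 - 1.0 / n
empirical = sum(m <= x for m in maxima) / trials
limit = math.exp(-(1.0 - u))

# Average distance of the maximum from 1, to compare with 1 / (n + 1).
mean_gap = 1.0 - sum(maxima) / trials

print(f"empirical CDF: {empirical:.4f}, limit: {limit:.4f}")
print(f"mean gap to 1: {mean_gap:.5f}, 1/(n+1): {1.0 / (n + 1):.5f}")
```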

References

  1. Fréchet, M. (1927). "Sur la loi de probabilité de l'écart maximum". Annales de la Société Polonaise de Mathématique. 6 (1): 93–116.
  2. Fisher, R.A.; Tippett, L.H.C. (1928). "Limiting forms of the frequency distribution of the largest and smallest member of a sample". Proc. Camb. Phil. Soc. 24 (2): 180–190. Bibcode:1928PCPS...24..180F. doi:10.1017/s0305004100015681. S2CID 123125823.
  3. von Mises, R. (1936). "La distribution de la plus grande de n valeurs" [The distribution of the largest of n values]. Rev. Math. Union Interbalcanique. 1 (in French): 141–160.
  4. Falk, Michael; Marohn, Frank (1993). "von Mises conditions revisited". The Annals of Probability: 1310–1328.
  5. Gnedenko, B.V. (1943). "Sur la distribution limite du terme maximum d'une serie aleatoire". Annals of Mathematics. 44 (3): 423–453. doi:10.2307/1968974. JSTOR 1968974.
  6. Mood, A.M. (1950). "5. Order Statistics". Introduction to the theory of statistics. New York, NY: McGraw-Hill. pp. 251–270.
  7. Haan, Laurens; Ferreira, Ana (2007). Extreme Value Theory: An introduction. Springer.
