We often use the term “functional” to mean a function whose arguments are functions. The value of a functional may be any kind of object, a real number or another function, for example. The domain of a functional is a set of functions. I will use notation of the following form: for the functional, a capital Greek or Latin letter, $\Upsilon$, $M$, etc.; for the domain, a calligraphic Latin letter, $\mathcal{F}$, $\mathcal{G}$, etc.; for a function, an italic letter, $g$, $F$, $G$, etc.; and for the value, the usual notation for functions, $\Upsilon(G)$ where $G \in \mathcal{G}$, for example.
Parameters of distributions as well as other interesting characteristics of distributions can often be defined in terms of functionals of the CDF. For example, the mean of a distribution, if it exists, may be written as the functional $M$ of the CDF $F$:
$$M(F) = \int y \, dF(y). \tag{1.109}$$
Viewed as a Riemann–Stieltjes integral, this mean functional reduces, for a discrete distribution, to the sum of the mass points times their associated probabilities.
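As a minimal sketch of this reduction, the following Python fragment evaluates $M(F)$ for a discrete distribution stored as a mapping from mass points to probabilities, and then for an ECDF, which places probability $1/n$ on each observation. The function names, the fair-die distribution, and the sample are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def mean_functional(mass_points):
    """M(F) for a discrete F given as {mass point: probability}."""
    if sum(mass_points.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    # Riemann-Stieltjes integral of y dF(y) reduces to a weighted sum.
    return sum(y * p for y, p in mass_points.items())

def ecdf_masses(sample):
    """Mass points of the ECDF: each observation carries probability 1/n."""
    n = len(sample)
    masses = {}
    for y in sample:
        masses[y] = masses.get(y, 0) + Fraction(1, n)
    return masses

# Hypothetical discrete distribution: a fair six-sided die.
fair_die = {k: Fraction(1, 6) for k in range(1, 7)}
die_mean = mean_functional(fair_die)

# Hypothetical sample; M applied to its ECDF is the sample mean.
sample = [2, 4, 4, 5, 10]
sample_mean = mean_functional(ecdf_masses(sample))
```

Exact rational arithmetic is used only so that the probabilities sum to exactly 1; floating-point weights would need a tolerance in that check.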
A functional operating on a CDF is called a statistical functional or statistical function. (This is because they are often applied to the ECDF, and in that case are “statistics”.) I will refer to the values of such functionals as distributional measures. (Although the distinction is not important, “$M$” in equation (1.109) is a capital Greek letter mu. I usually, but not always, will use upper-case Greek letters to denote functionals, especially functionals of CDFs, and in those cases I usually will use the corresponding lower-case letters to represent the measures defined by the functionals.)
Many statistical functions, such as $M(F)$ above, are expectations; but not all are expectations. For example, the quantile functional in equation (1.113) below cannot be written as an expectation. (This was shown by Bickel and Lehmann (1969).)
Linear functionals are often of interest. The statistical function $M$ in equation (1.109), for example, is linear over the distribution function space of CDFs for which the integral exists.
It is important to recognize that a given functional may not exist at a given CDF. For example, if
$$F(y) = \frac{1}{2} + \frac{1}{\pi}\tan^{-1}\!\left(\frac{y-\alpha}{\beta}\right)$$
(that is, the distribution is Cauchy, here with location $\alpha$ and scale $\beta > 0$), then $M(F)$ does not exist. (Recall that I follow the convention that when I write an expression such as $M(F)$ or $\Upsilon(F)$, I generally imply the existence of the functional for the given $F$. That is, I do not always use a phrase about existence of something that I implicitly assume exists.)
Also, for some parametric distributions, such as the family of beta distributions, there may not be a “nice” functional that yields the parameter.
A functional of a CDF is generally a function of any parameters associated with the distribution, and in fact we often define a parameter as a statistical function. For example, if $\mu$ and $\sigma$ are parameters of a distribution with CDF $F(y; \mu, \sigma)$ and $\Upsilon$ is some functional, we have
$$\Upsilon(F(y; \mu, \sigma)) = g(\mu, \sigma),$$
for some function $g$. If, for example, the $M$ in equation (1.109) above is $\Upsilon$ and the $F$ is the normal CDF $F(y; \mu, \sigma)$, then $\Upsilon(F(y; \mu, \sigma)) = \mu$.
Moments
For a univariate distribution with CDF $F$, the $r$th central moment from equation (1.50), if it exists, is the functional
$$\mu_r = M_r(F) = \int (y-\mu)^r \, dF(y). \tag{1.111}$$
For general random vectors or random variables with more complicated structures, this expression may be rather complicated. For $r = 2$, the matrix of joint moments for a random vector, as given in equation (1.69), is the functional
$$\Sigma(F) = \int (y-\mu)(y-\mu)^{\mathrm{T}} \, dF(y). \tag{1.112}$$
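To make the moment functionals concrete, here is a small sketch that applies them to an ECDF, so that each integral becomes an average over the observations with divisor $n$ (the plug-in form, not the $n-1$ form often used for estimation). The function names and the data are hypothetical and chosen only for illustration.

```python
def central_moment(sample, r):
    """r-th central moment functional evaluated at the ECDF of `sample`."""
    n = len(sample)
    mu = sum(sample) / n
    # Integral of (y - mu)^r dF_n(y): each observation has mass 1/n.
    return sum((y - mu) ** r for y in sample) / n

def covariance_functional(rows):
    """Sigma(F_n) for a list of d-vectors: mean outer product of deviations."""
    n = len(rows)
    d = len(rows[0])
    mu = [sum(row[j] for row in rows) / n for j in range(d)]
    return [
        [sum((row[i] - mu[i]) * (row[j] - mu[j]) for row in rows) / n
         for j in range(d)]
        for i in range(d)
    ]

# Hypothetical univariate sample.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
second = central_moment(sample, 2)
third = central_moment(sample, 3)

# Hypothetical bivariate sample lying exactly on a line.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
sigma = covariance_functional(rows)
```

For the bivariate data the second coordinate is twice the first, so the resulting matrix is singular, as one would expect for a degenerate distribution.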
Quantiles
Another set of useful distributional measures for describing a univariate distribution with CDF $F$ are the quantiles. For $\pi \in\, ]0,1[$, the $\pi$ quantile is given by the functional $\Xi_\pi(F)$:
$$\Xi_\pi(F) = \inf\{y, \text{ s.t. } F(y) \ge \pi\}. \tag{1.113}$$
This functional is the same as the quantile function or the generalized inverse CDF,
$$\Xi_\pi(F) = F^{-1}(\pi), \tag{1.114}$$
as given in Definition 1.15.
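The infimum in the quantile functional can be evaluated directly when $F$ is an ECDF, because the ECDF only increases at the observed values. The sketch below scans the sorted sample and returns the first value at which the ECDF reaches $\pi$; the function name and the samples are assumptions made for illustration.

```python
def quantile_functional(sample, pi):
    """inf{y : F_n(y) >= pi} for the ECDF F_n of `sample`, with 0 < pi < 1."""
    if not 0 < pi < 1:
        raise ValueError("pi must lie in the open interval ]0, 1[")
    ys = sorted(sample)
    n = len(ys)
    for i, y in enumerate(ys, start=1):
        # At the i-th order statistic the ECDF is at least i/n.
        if i / n >= pi:
            return y
    return ys[-1]  # not reached for pi < 1; kept as a safeguard

# Hypothetical samples.
odd_sample = [7, 1, 3, 9, 5]
even_sample = [4, 1, 3, 2]

median_odd = quantile_functional(odd_sample, 0.5)
lower_quartile = quantile_functional(odd_sample, 0.25)
median_even = quantile_functional(even_sample, 0.5)
```

With an even number of observations this definition returns the lower of the two middle values rather than their average, which is one of several sample-median conventions in use.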
The 0.5 quantile is an important one; it is called the median. For the Cauchy distribution, for example, the moment functionals do not exist, but the median does. An important functional for the Cauchy distribution is, therefore, $\Xi_{0.5}(F)$ because that is the location of the “middle” of the distribution.
Quantiles can be used for measures of scale and of characteristics of the shape of a distribution. A measure of the scale of a distribution, for example, is the interquartile range:
$$\Xi_{0.75} - \Xi_{0.25}. \tag{1.115}$$
Various measures of skewness can be defined as
$$\frac{(\Xi_{1-\pi} - \Xi_{0.5}) - (\Xi_{0.5} - \Xi_{\pi})}{\Xi_{1-\pi} - \Xi_{\pi}}, \tag{1.116}$$
for $0 < \pi < 0.5$. For $\pi = 0.25$, this is called the quartile skewness or the Bowley coefficient. For $\pi = 0.125$, it is called the octile skewness. These can be especially useful when the measures based on moments do not exist. The extent of the peakedness and tail weight can be indicated by the ratio of interquantile ranges:
$$\frac{\Xi_{1-\pi_1} - \Xi_{\pi_1}}{\Xi_{1-\pi_2} - \Xi_{\pi_2}}. \tag{1.117}$$
These measures can be more useful than the kurtosis coefficient based on the fourth moment, because different choices of $\pi_1$ and $\pi_2$ emphasize different aspects of the distribution. In expression (1.117), $\pi_1 = 0.025$ and $\pi_2 = 0.125$ yield a good measure of tail weight, and $\pi_1 = 0.125$ and $\pi_2 = 0.25$ yield a good measure of peakedness.
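The quantile-based measures in expressions (1.115) through (1.117) can be sketched for distributions whose quantile functions are known in closed form. The example below assumes the standard Cauchy, standard exponential, and standard normal distributions; those choices and all function names are illustrative and do not come from the text.

```python
import math
from statistics import NormalDist

def cauchy_quantile(p):
    """Quantile function of the standard Cauchy distribution."""
    return math.tan(math.pi * (p - 0.5))

def exponential_quantile(p):
    """Quantile function of the standard exponential distribution."""
    return -math.log(1.0 - p)

# Quantile function of the standard normal distribution.
normal_quantile = NormalDist().inv_cdf

def interquartile_range(q):
    """Expression (1.115) for a quantile function q."""
    return q(0.75) - q(0.25)

def quantile_skewness(q, p=0.25):
    """Expression (1.116); p = 0.25 gives the quartile (Bowley) skewness."""
    upper = q(1 - p) - q(0.5)
    lower = q(0.5) - q(p)
    return (upper - lower) / (q(1 - p) - q(p))

def interquantile_ratio(q, p1, p2):
    """Expression (1.117): ratio of two interquantile ranges."""
    return (q(1 - p1) - q(p1)) / (q(1 - p2) - q(p2))

cauchy_iqr = interquartile_range(cauchy_quantile)
cauchy_skew = quantile_skewness(cauchy_quantile)
exponential_skew = quantile_skewness(exponential_quantile)
cauchy_tail = interquantile_ratio(cauchy_quantile, 0.025, 0.125)
normal_tail = interquantile_ratio(normal_quantile, 0.025, 0.125)
```

All of these quantities are finite for the Cauchy distribution even though its moments do not exist. The symmetric Cauchy has quartile skewness zero, the right-skewed exponential has positive quartile skewness, and the tail-weight ratio is larger for the Cauchy than for the normal.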
$L_J$ Functionals
Various modifications of the mean functional $M$ in equation (1.109) are often useful, especially in robust statistics. A functional of the form
$$L_J(F) = \int y J(y) \, dF(y), \tag{1.118}$$
for some given function $J$, is called an $L_J$ functional. If $J \equiv 1$, this is the mean functional. Often $J$ is defined as a function of $F(y)$.
A “trimmed mean”, for example, is defined by an $L_J$ functional with $J(y) = (\beta-\alpha)^{-1} I_{]\alpha,\beta[}(F(y))$, for constants $0 \le \alpha < \beta \le 1$ and where $I$ is the indicator function. In this case, the $L_J$ functional is often denoted as $T_{\alpha,\beta}$. Often $\beta$ is taken to be $1-\alpha$, so the trimming is symmetric in probability content.
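A literal plug-in evaluation of the trimmed-mean functional at an ECDF can be sketched as follows. Each observation contributes $y\,J(y)/n$, with $J$ computed from the ECDF value at that observation. The function name and the sample, which contains one gross outlier, are hypothetical; exact rational arithmetic is used so that the open-interval test on $F(y)$ is not affected by rounding.

```python
from bisect import bisect_right
from fractions import Fraction

def lj_trimmed(sample, alpha, beta):
    """T_{alpha,beta} evaluated at the ECDF of `sample` (literal plug-in)."""
    ys = sorted(sample)
    n = len(ys)
    total = Fraction(0)
    for y in ys:
        # ECDF at y: proportion of observations less than or equal to y.
        F_y = Fraction(bisect_right(ys, y), n)
        # J(y) = 1/(beta - alpha) when alpha < F(y) < beta, else 0.
        if alpha < F_y < beta:
            total += Fraction(y) * Fraction(1, n) / (beta - alpha)
    return total

# Hypothetical sample with one gross outlier.
sample = list(range(1, 10)) + [1000]
plain_mean = Fraction(sum(sample), len(sample))
trimmed = lj_trimmed(sample, Fraction(1, 10), Fraction(9, 10))
```

Because the interval $]\alpha,\beta[$ is open and the ECDF is a step function, the weights in this literal plug-in need not sum to one (here they sum to 7/8). Sample trimmed means used in practice are commonly renormalized by the retained weight.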
$M_\rho$ Functionals
Another family of functionals that generalize the mean functional is defined as a solution to the minimization problem
$$\int \rho(y, M_\rho(F)) \, dF(y) = \min_{\theta\in\Theta} \int \rho(y, \theta) \, dF(y), \tag{1.119}$$
for some function $\rho$ and where $\Theta$ is some open subset of $\mathbb{R}^d$. A functional defined as the solution to this optimization problem is called an $M_\rho$ functional. (Note the similarity in names and notation: we call the $M$ in equation (1.109) the mean functional; and we call the $M_\rho$ in equation (1.119) the $M_\rho$ functional.)
Two related functions that play important roles in the analysis of $M_\rho$ functionals are
$$\psi(y, t) = \frac{\partial \rho(y, t)}{\partial t}, \tag{1.120}$$
and
$$\lambda_F(t) = \int \psi(y, t) \, dF(y) = \frac{\partial}{\partial t} \int \rho(y, t) \, dF(y). \tag{1.121}$$
If $y$ is a scalar and $\rho(y, \theta) = (y-\theta)^2$, then $M_\rho(F)$ is the mean functional from equation (1.109). Other common functionals also yield solutions to the optimization problem (1.119); for example, for $\rho(y, \theta) = |y-\theta|$, $\Xi_{0.5}(F)$ from equation (1.113) is an $M_\rho$ functional (possibly nonunique).
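The two special cases just mentioned can be checked numerically by minimizing the objective in (1.119) at an ECDF. The sketch below assumes a scalar $\theta$ and a $\rho$ that is convex in $\theta$, and it uses a ternary search over the range of the data; the search method, the function names, and the sample are illustrative assumptions rather than anything prescribed in the text.

```python
def m_rho_functional(sample, rho, iterations=200):
    """Approximate minimizer over theta of the ECDF average of rho(y, theta).

    Assumes theta is scalar, rho is convex in theta, and a minimizer lies
    between the smallest and largest observations.
    """
    def objective(theta):
        # Integral of rho(y, theta) dF_n(y): an average over the sample.
        return sum(rho(y, theta) for y in sample) / len(sample)

    lo, hi = min(sample), max(sample)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            hi = m2  # a minimizer lies in [lo, m2]
        else:
            lo = m1  # a minimizer lies in [m1, hi]
    return (lo + hi) / 2

def squared(y, theta):
    return (y - theta) ** 2

def absolute(y, theta):
    return abs(y - theta)

# Hypothetical sample with one large observation.
sample = [1, 2, 3, 4, 100]
theta_squared = m_rho_functional(sample, squared)    # close to the mean
theta_absolute = m_rho_functional(sample, absolute)  # close to the median
```

The single large observation pulls the squared-error solution far from the bulk of the data, while the absolute-error solution stays at the middle observation.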
We often choose the $\rho$ in an $M_\rho$ functional to be a function of $y-\theta$, and to be convex and differentiable. In this case, the $M_\rho$ functional is the solution to
$$\mathrm{E}(\psi(Y-\theta)) = 0, \tag{1.122}$$
where
$$\psi(y-\theta) = \mathrm{d}\rho(y-\theta)/\mathrm{d}\theta,$$
if that solution is in the interior of $\Theta$.
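As an illustration of equation (1.122), the sketch below solves the estimating equation at an ECDF by bisection. It assumes Huber's $\rho$, which is quadratic for $|y-\theta| \le k$ and linear beyond; that choice, the tuning constant $k$, the function names, and the sample are assumptions introduced here and are not taken from the text.

```python
def huber_psi(u, k):
    """d rho(y - theta) / d theta for Huber's rho, as a function of u = y - theta.

    Huber's rho (an assumed choice) is u**2 / 2 for |u| <= k and
    k * |u| - k**2 / 2 otherwise, so the derivative in theta is -clip(u, -k, k).
    """
    return -max(-k, min(k, u))

def solve_estimating_equation(sample, k, iterations=200):
    """Root in theta of the ECDF average of psi(y - theta), found by bisection."""
    def lam(theta):
        return sum(huber_psi(y - theta, k) for y in sample) / len(sample)

    # lam is nondecreasing in theta, with lam <= 0 at the smallest
    # observation and lam >= 0 at the largest one.
    lo, hi = min(sample), max(sample)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if lam(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical sample with one large observation.
sample = [1, 2, 3, 4, 100]
theta_huber = solve_estimating_equation(sample, k=1.5)
theta_large_k = solve_estimating_equation(sample, k=1000.0)
```

With a small $k$ the influence of the large observation is bounded and the solution stays near the bulk of the data. With $k$ larger than every deviation, no clipping occurs and the solution coincides with the sample mean.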