
6.2.1 Independent observations and random variables

In a classical estimation problem, we have a parametric family $(P'_\theta)_{\theta\in\Theta}$ of precise probability distributions on a sample space $(\mathcal{X},\mathcal{A}')$. The task is to estimate the true parameter $\theta_0 \in \Theta$. Most often, it is assumed that the estimation can be based on a whole set of data

$$x_1, \dots, x_n \in \mathcal{X}$$

which are independent and identically distributed according to the true distribution $P'_{\theta_0}$. That is, the vector $x = (x_1, \dots, x_n)$ consisting of all observations is distributed according to the product measure $(P'_{\theta_0})^{\otimes n}$.

In a (more realistic) imprecise probability setup, it is natural to replace the precise model $(P'_\theta)_{\theta\in\Theta}$ by an imprecise model $(\overline{P}'_\theta)_{\theta\in\Theta}$ which consists of coherent upper previsions $\overline{P}'_\theta$.

Hence, it is assumed that the data

$$x_1, \dots, x_n \in \mathcal{X}$$

are independent and identically distributed according to the true $\overline{P}'_{\theta_0}$ or, in other words, the vector $x = (x_1, \dots, x_n)$ consisting of all observations is distributed according to a coherent upper product prevision $(\overline{P}'_{\theta_0})^{\otimes n}$.

As stated in the introductory Section 6.1, there are several different ways to define such products of coherent upper previsions. In the following, the type-2 product is used, which corresponds to a strict sensitivity analyst's point of view. This product prevision is defined to be the coherent upper prevision

$$(\overline{P}'_\theta)^{\otimes n} : \mathcal{L}_\infty\big(\mathcal{X}^n, \mathcal{A}^{\prime\,\otimes n}\big) \to \mathbb{R}$$

which has the credal set

$$\overline{\mathrm{co}}\,\big\{ (P'_\theta)^{\otimes n} \;\big|\; P'_\theta \in \mathcal{M}'_\theta \big\}$$

where $\mathcal{M}'_\theta$ denotes the credal set of $\overline{P}'_\theta$.

[7] See e.g. Parr and Schucany (1980), Millar (1981), Donoho and Liu (1988), Rieder (1994, §6) and Öztürk and Hettmansperger (1998).

Though this definition of the type-2 product is commonly used, it is not elaborated enough for the following investigations. The reason is that the minimum distance estimator is based on the empirical measure and, therefore, we have to deal with stochastic processes. In this context, a detailed mathematical formulation of the setup is necessary. In classical probability theory and mathematical statistics, this is done by means of random variables and image measures. In the following, it is shown how this formalization can be adapted to imprecise probabilities.

Firstly, let us recall the classical setup: There, a random observation or data point $x'$ in a set $\mathcal{X}$ is mathematically formalized by a map

$$X' : \Omega \to \mathcal{X}, \qquad \omega \mapsto X'(\omega)$$

where $\Omega$ is a fixed set which is rarely specified more closely. There are a fixed σ-algebra $\mathcal{F}$ on $\Omega$ and a fixed σ-algebra $\mathcal{A}'$ on $\mathcal{X}$, and it is assumed that $X'$ is measurable with respect to these σ-algebras. $X'$ is called a random variable.

Next, it is assumed that an unspecified event $\omega$ has randomly happened which, by (deterministic) physical principles, has led to the observation

$$x' = X'(\omega)$$

The events $\omega \in \Omega$ are distributed according to a (precise) distribution $U$ or a (precise) distribution $U_\theta$ on $(\Omega,\mathcal{F})$ where $\theta$ is an unknown parameter.

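Let $A' \in \mathcal{A}'$ be a measurable subset of $\mathcal{X}$. Then, the probability that the observation $x'$ lies in $A'$ is equal to $U_\theta\big(\{\omega \in \Omega \mid X'(\omega) \in A'\}\big)$.

That is, $x'$ is distributed according to the precise probability measure

$$P'_\theta : \mathcal{A}' \to [0,1], \qquad A' \mapsto U_\theta\big(\{\omega \in \Omega \mid X'(\omega) \in A'\}\big) \tag{6.2}$$

This defines a (precise) statistical model $(P'_\theta)_{\theta\in\Theta}$ for the observation $x'$. The measure $P'_\theta$ defined by (6.2) is called the image measure of $U_\theta$ under $X'$ and is denoted by $P'_\theta = X'(U_\theta)$.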
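On a finite set $\Omega$, the image measure in (6.2) can be computed by summing the mass of all $\omega$ that are mapped to the same observation. The following is a minimal sketch under that finiteness assumption; the die example and the names `image_measure`, `u_theta` and `p_theta` are purely illustrative and do not come from the text.

```python
from collections import defaultdict
from fractions import Fraction


def image_measure(u, x_map):
    """Push the probability mass function u on Omega forward through x_map.

    u     : dict mapping each omega to its probability U_theta({omega})
    x_map : function Omega -> X (the random variable X')
    Returns the pmf of the image measure X'(U_theta) on the range of x_map.
    """
    p = defaultdict(Fraction)  # Fraction() == 0, so missing keys start at 0
    for omega, mass in u.items():
        p[x_map(omega)] += mass
    return dict(p)


# Hypothetical toy example: Omega = faces of a fair die, X'(omega) = parity.
u_theta = {omega: Fraction(1, 6) for omega in range(1, 7)}
p_theta = image_measure(u_theta, lambda omega: omega % 2)
```

Here `p_theta[a]` is the probability of the preimage $\{\omega \mid X'(\omega) = a\}$, which is exactly the set appearing in (6.2) for a one-point set $A' = \{a\}$.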

A whole set of observations/data $x_1, \dots, x_n$ is modeled via several random variables

$$X_i : \Omega \to \mathcal{X}, \qquad i \in \{1, \dots, n\}$$

Accordingly, it is assumed that the (unspecified) event $\omega \in \Omega$ has led to the observations/data

$$x_1 = X_1(\omega), \ \dots, \ x_n = X_n(\omega)$$

The random variables $X_1, \dots, X_n$ are called independent and identically distributed with respect to $U_\theta$ if their joint image measure is equal to the product of the single image measures and these image measures coincide:

$$\begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix}(U_\theta) = X_1(U_\theta) \otimes \dots \otimes X_n(U_\theta) = (P'_\theta)^{\otimes n}$$
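For finite $\Omega$ and finite $\mathcal{X}$, this condition can be checked directly by comparing the joint image measure with the product of the marginals. The sketch below assumes such a finite setting; the helper names and the two toy distributions are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def marginal(u, x_map):
    """Image measure of the pmf u under a single random variable x_map."""
    out = {}
    for omega, mass in u.items():
        out[x_map(omega)] = out.get(x_map(omega), Fraction(0)) + mass
    return out


def joint_image(u, maps):
    """Image measure of u under omega -> (X_1(omega), ..., X_n(omega))."""
    joint = {}
    for omega, mass in u.items():
        key = tuple(x(omega) for x in maps)
        joint[key] = joint.get(key, Fraction(0)) + mass
    return joint


def is_iid(u, maps):
    """True if the joint image equals the n-fold product of one common marginal."""
    margs = [marginal(u, x) for x in maps]
    if any(m != margs[0] for m in margs):
        return False  # the single image measures do not coincide
    joint = joint_image(u, maps)
    for point in product(margs[0].keys(), repeat=len(maps)):
        expected = Fraction(1)
        for value in point:
            expected *= margs[0][value]
        if joint.get(point, Fraction(0)) != expected:
            return False  # joint image is not the product measure
    return True


# Toy example: Omega = {0,1}^2, X_i = i-th coordinate.
q = Fraction(1, 3)
bern = {0: 1 - q, 1: q}
coords = [lambda w: w[0], lambda w: w[1]]

u_product = {(a, b): bern[a] * bern[b] for a in (0, 1) for b in (0, 1)}
u_dependent = {(0, 0): 1 - q, (1, 1): q}  # same marginals, but X_1 = X_2

iid_ok = is_iid(u_product, coords)
dep_ok = is_iid(u_dependent, coords)
```

The second distribution has identical marginals but fails the product condition, so identical distribution alone is not sufficient.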

Now, let us turn to imprecise probabilities again: Due to our sensitivity analyst's point of view, it is assumed in the imprecise probability setup that there is a coherent upper prevision $\overline{U}_\theta$, and that the distribution $U_\theta$ of the events $\omega \in \Omega$ is unknown and can be any element of the credal set $\mathcal{U}_\theta$ of $\overline{U}_\theta$.

Analogously to the ordinary image measure, we can define the image of a coherent upper prevision:

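Definition 6.1 The coherent upper prevision $\overline{P}'_\theta$ on $\mathcal{L}_\infty(\mathcal{X},\mathcal{A}')$ which corresponds to the credal set

$$\mathcal{M}'_\theta = \big\{ X'(U_\theta) \;\big|\; U_\theta \in \mathcal{U}_\theta \big\} \tag{6.3}$$

is called the image of $\overline{U}_\theta$ under $X'$ and is denoted by

$$\overline{P}'_\theta = X'(\overline{U}_\theta)$$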

Lemma 6.2 below shows that this is well defined. That is, the image of a coherent upper prevision is again a coherent upper prevision. This generalizes the fact from classical probability theory that the image of a probability measure is again a probability measure.

In this way, we get an imprecise model $(\overline{P}'_\theta)_{\theta\in\Theta}$. Since $U_\theta$ is any element of the credal set $\mathcal{U}_\theta$, the distribution of the observation $x'$ modeled by the random variable $X'$ is any element of the credal set $\mathcal{M}'_\theta$. The essential difference from the precise setting is that, given $\theta$, the true $U_\theta \in \mathcal{U}_\theta$ and, accordingly, the true $P'_\theta \in \mathcal{M}'_\theta$ are totally unknown.
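In a finite setting where the credal set $\mathcal{U}_\theta$ is the convex hull of finitely many probability mass functions, the image credal set in (6.3) is generated by the images of these generators, because taking the image measure is linear. The sketch below assumes exactly this finite situation; the generators, the map `x_prime` and the gamble `f` are made up for illustration. It also checks the identity $\overline{P}'_\theta[f] = \overline{U}_\theta[f \circ X']$, which follows from the definition since $X'(U_\theta)[f] = U_\theta[f \circ X']$ for every $U_\theta$.

```python
from fractions import Fraction

F = Fraction


def expectation(p, f):
    """Expectation of f under the pmf p."""
    return sum(mass * f(x) for x, mass in p.items())


def image_measure(u, x_map):
    """Image of the pmf u under x_map."""
    out = {}
    for omega, mass in u.items():
        out[x_map(omega)] = out.get(x_map(omega), F(0)) + mass
    return out


def upper_prevision(generators, f):
    """Upper prevision of the convex hull of finitely many pmfs.

    The expectation is linear in the pmf, so the supremum over the
    convex hull is attained at one of the generators.
    """
    return max(expectation(p, f) for p in generators)


# Hypothetical generators of the credal set on Omega = {0, 1, 2, 3}.
u_generators = [
    {0: F(1, 4), 1: F(1, 4), 2: F(1, 4), 3: F(1, 4)},
    {0: F(1, 2), 1: F(1, 2), 2: F(0), 3: F(0)},
    {0: F(0), 1: F(1, 10), 2: F(2, 5), 3: F(1, 2)},
]


def x_prime(omega):
    """Random variable X' : Omega -> {0, 1}."""
    return omega // 2


def f(x):
    """A bounded function (gamble) on the sample space {0, 1}."""
    return 3 * x - 1


# Generators of the image credal set and the two upper previsions.
m_generators = [image_measure(u, x_prime) for u in u_generators]
upper_image = upper_prevision(m_generators, f)
upper_pullback = upper_prevision(u_generators, lambda omega: f(x_prime(omega)))
```

Both `upper_image` and `upper_pullback` evaluate the same supremum, once on the sample space and once on $\Omega$.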

Lemma 6.2 $\mathcal{M}'_\theta$ defined by (6.3) is a credal set on $(\mathcal{X},\mathcal{A}')$.

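Proof: The map

$$\xi : \mathrm{ba}(\Omega,\mathcal{F}) \to \mathrm{ba}(\mathcal{X},\mathcal{A}'), \qquad \nu \mapsto \xi(\nu)$$

defined by

$$\xi(\nu)(A') = \nu\big((X')^{-1}(A')\big)$$

is linear and continuous with respect to the $\mathcal{L}_\infty(\Omega,\mathcal{F})$-topology on $\mathrm{ba}(\Omega,\mathcal{F})$ and the $\mathcal{L}_\infty(\mathcal{X},\mathcal{A}')$-topology on $\mathrm{ba}(\mathcal{X},\mathcal{A}')$. Together with

$$\xi\big(\mathrm{ba}^+_1(\Omega,\mathcal{F})\big) \subset \mathrm{ba}^+_1(\mathcal{X},\mathcal{A}')$$

this implies that $\xi(\mathcal{U}_\theta)$ is a convex and $\mathcal{L}_\infty(\mathcal{X},\mathcal{A}')$-compact subset of $\mathrm{ba}^+_1(\mathcal{X},\mathcal{A}')$. According to Corollary 2.16, $\xi(\mathcal{U}_\theta)$ is a credal set, and the definitions imply

$$\xi(\mathcal{U}_\theta) = \mathcal{M}'_\theta$$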

Just as in the precise case, it is assumed that the random variables

$$X_i : \Omega \to \mathcal{X}, \qquad i \in \{1, \dots, n\}$$

are independent and identically distributed. That is, the joint distribution of the observations is equal to

$$\begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix}(U_\theta) = X_1(U_\theta) \otimes \dots \otimes X_n(U_\theta) = (P'_\theta)^{\otimes n}$$

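Since $U_\theta$ may be any element of $\mathcal{U}_\theta$, the distribution of the vector $x = (x_1, \dots, x_n)$ containing all observations may be any element of

$$\mathcal{N}'_\theta := \big\{ (P'_\theta)^{\otimes n} \;\big|\; P'_\theta \in \mathcal{M}'_\theta \big\}$$

This set of product probabilities defines a coherent upper prevision

$$(\overline{P}'_\theta)^{\otimes n} : \mathcal{L}_\infty\big(\mathcal{X}^n, \mathcal{A}^{\prime\,\otimes n}\big) \to \mathbb{R}, \qquad g' \mapsto \sup_{(P'_\theta)^{\otimes n} \in \mathcal{N}'_\theta} (P'_\theta)^{\otimes n}[g']$$

According to Proposition 2.15, the credal set of this coherent upper prevision is equal to

$$\overline{\mathrm{co}}\,\big\{ (P'_\theta)^{\otimes n} \;\big|\; P'_\theta \in \mathcal{M}'_\theta \big\}$$

so that, in fact, we end up with the usual type-2 product of coherent upper previsions again.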
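A small numerical sketch may help to see what the supremum over product measures involves. It assumes a hypothetical credal set consisting of all Bernoulli distributions with success probability in an interval, and approximates the supremum on a grid (in general, a grid only gives a lower bound for the supremum). The function names, the interval and the gamble `g` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def product_expectation(p, g, n):
    """Expectation of g under the n-fold product of the pmf p."""
    total = Fraction(0)
    for point in product(p.keys(), repeat=n):
        weight = Fraction(1)
        for x in point:
            weight *= p[x]
        total += weight * g(point)
    return total


def type2_upper(lo, hi, g, n, steps=60):
    """Grid approximation of sup over Bernoulli(p), p in [lo, hi], of the
    expectation of g under the n-fold product of Bernoulli(p)."""
    best = None
    for k in range(steps + 1):
        p1 = lo + (hi - lo) * Fraction(k, steps)
        value = product_expectation({0: 1 - p1, 1: p1}, g, n)
        if best is None or value > best:
            best = value
    return best


def g(point):
    """Indicator that the two observations differ."""
    return 1 if point[0] != point[1] else 0


lo, hi = Fraction(1, 5), Fraction(4, 5)
upper = type2_upper(lo, hi, g, n=2)

# Evaluating only the two extreme points of the interval misses the supremum.
at_endpoints = max(
    product_expectation({0: 1 - p, 1: p}, g, 2) for p in (lo, hi)
)
```

For this `g`, the expectation under the product of two Bernoulli($p$) distributions is $2p(1-p)$, which is maximal at $p = 1/2$ in the interior of the interval. Because $P'_\theta \mapsto (P'_\theta)^{\otimes n}[g']$ is not linear for $n \ge 2$, the supremum cannot in general be found by checking the extreme points of $\mathcal{M}'_\theta$ alone.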

Note that the credal sets $\mathcal{M}'_\theta$ may also contain probability charges which are not σ-additive. Products of probability charges such as $(P'_\theta)^{\otimes n}$ are defined according to (König, 1997, Proposition 20.4). However, these products are not defined on the product σ-algebra $\mathcal{A}^{\prime\,\otimes n}$ but on the (usually) smaller product algebra denoted by $\mathcal{A}^{\prime\,\hat{\otimes} n}$. This is the smallest algebra on $\mathcal{X}^n$ which contains all rectangles

$$A'_1 \times \dots \times A'_n \subset \mathcal{X}^n \qquad \text{where } A'_1, \dots, A'_n \in \mathcal{A}'$$

That is, $(\overline{P}'_\theta)^{\otimes n}$ is defined on the product algebra $\mathcal{A}^{\prime\,\hat{\otimes} n}$ at first. Next, $(\overline{P}'_\theta)^{\otimes n}$ can be extended to a coherent upper prevision on the usual product σ-algebra $\mathcal{A}^{\prime\,\otimes n}$ by natural extension.

6.2.2 Discretizations in estimation problems

As argued in Subsection 5.4.1, discretizing the parameter space $\Theta$ may be considered as part of modeling in estimation problems, because coarsening $\Theta$ also changes the purpose of the estimation problem, and this change of purpose is desirable from the point of view of the theory of imprecise probabilities.

Modelers will nevertheless often produce an infinite parameter space Θ . Therefore, an ad hoc method for discretizing Θ is developed in the following:

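Let $\Theta$ be any index set and $(\overline{P}'_\theta)_{\theta\in\Theta}$ be an imprecise model on a sample space $(\mathcal{X},\mathcal{A}')$. For every $\theta \in \Theta$, let $\mathcal{M}'_\theta$ be the credal set of $\overline{P}'_\theta$ on $(\mathcal{X},\mathcal{A}')$.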

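In order to discretize $\Theta$, let $\mathcal{H}$ be a finite partition of $\Theta$. Now, the parameter set in our estimation problem is $\mathcal{H}$ and we want to estimate the true $H \in \mathcal{H}$. This is the set $H \in \mathcal{H}$ in which the true parameter $\theta$ lies. That is, we no longer want to discriminate between different elements $\theta_1$ and $\theta_2$ of one $H$. In this sense, the estimation problem gets coarser. The (upper) risk function depending on $H \in \mathcal{H}$ is canonically defined by

$$\mathcal{H} \to \mathbb{R}, \qquad H \mapsto \sup_{\theta \in H} \ \sup_{P'_\theta \in \mathcal{M}'_\theta} \int_{\mathcal{X}} \int_{\Theta} W_\theta(\hat{\theta}) \, \tau_x(d\hat{\theta}) \, P'_\theta(dx) \tag{6.4}$$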

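where $(W_\theta)_{\theta\in\Theta} \subset \mathcal{L}_\infty(\Theta, 2^\Theta)$ is a loss function and $\tau$ is a (randomized) decision function, i.e. an estimator. Since we do not want to discriminate between different elements $\theta_1$ and $\theta_2$ of one $H$, it is natural to choose a loss function which depends only on $H$ and not on the specific $\theta$; that is, we have a loss function

$$(W_H)_{H\in\mathcal{H}} \subset \mathcal{L}_\infty(\mathcal{H}, 2^{\mathcal{H}})$$

Furthermore, the decision space changes from $\Theta$ to $\mathcal{H}$ and the risk function becomes

$$\mathcal{H} \to \mathbb{R}, \qquad H \mapsto \sup_{\theta \in H} \ \sup_{P'_\theta \in \mathcal{M}'_\theta} \int_{\mathcal{X}} \int_{\mathcal{H}} W_H(\hat{H}) \, \tau_x(d\hat{H}) \, P'_\theta(dx) \tag{6.5}$$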

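for an estimator $\tau$. Next, put

$$\mathcal{M}'_H := \overline{\mathrm{co}} \bigcup_{\theta \in H} \mathcal{M}'_\theta \qquad \forall H \in \mathcal{H}$$

where $\overline{\mathrm{co}}$ denotes the convex $\mathcal{L}_\infty(\mathcal{X},\mathcal{A}')$-closure. That is, $\mathcal{M}'_H$ is the credal set of the coherent upper prevision $\overline{P}'_H$ defined by

$$\overline{P}'_H : \mathcal{L}_\infty(\mathcal{X},\mathcal{A}') \to \mathbb{R}, \qquad f \mapsto \sup_{\theta \in H} \overline{P}'_\theta[f]$$

According to Lemma 8.29, the risk function defined by (6.5) is equal to

$$\mathcal{H} \to \mathbb{R}, \qquad H \mapsto \sup_{P'_H \in \mathcal{M}'_H} \int_{\mathcal{X}} \int_{\mathcal{H}} W_H(\hat{H}) \, \tau_x(d\hat{H}) \, P'_H(dx) \tag{6.6}$$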

and this function exactly coincides with the usual risk function defined in Section 3.2 if $(\overline{P}'_H)_{H\in\mathcal{H}}$ is our imprecise model. That is, discretizing $\Theta$ naturally leads to the imprecise model $(\overline{P}'_H)_{H\in\mathcal{H}}$, where $\mathcal{H}$ is a finite index set.
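The discretized model can be sketched for a finite toy case. The example assumes a four-element parameter set, a two-point sample space, credal sets generated by two Bernoulli distributions each, a deterministic estimator (a special case of a randomized decision function) and a 0-1 loss on the partition. All of these choices and names are hypothetical and only illustrate the construction, not an application from the text.

```python
from fractions import Fraction

F = Fraction


def expectation(p, f):
    """Expectation of f under the pmf p."""
    return sum(mass * f(x) for x, mass in p.items())


def bern(q):
    """Bernoulli pmf on the sample space {0, 1}."""
    return {0: 1 - q, 1: q}


# Hypothetical model: each credal set is the convex hull of two pmfs.
credal = {
    1: [bern(F(1, 10)), bern(F(2, 10))],
    2: [bern(F(2, 10)), bern(F(3, 10))],
    3: [bern(F(6, 10)), bern(F(7, 10))],
    4: [bern(F(7, 10)), bern(F(9, 10))],
}

# Finite partition of the parameter set {1, 2, 3, 4}.
partition = {"low": [1, 2], "high": [3, 4]}


def generators(H):
    """Generators of the credal set of H: union over all theta in H."""
    return [p for theta in partition[H] for p in credal[theta]]


def upper_prevision_H(H, f):
    """sup over theta in H of the upper prevision of f.

    Expectations are linear in the pmf, so the supremum over the closed
    convex hull of the union is attained at one of the generators.
    """
    return max(expectation(p, f) for p in generators(H))


def estimator(x):
    """Deterministic estimator mapping an observation to a cell of the partition."""
    return "high" if x == 1 else "low"


def upper_risk(H):
    """Upper risk for 0-1 loss: largest probability of choosing the wrong cell."""
    def loss(x):
        return 0 if estimator(x) == H else 1
    return upper_prevision_H(H, loss)


risks = {H: upper_risk(H) for H in partition}
```

In this finite single-observation case, the double supremum in (6.5) and the supremum over the pooled credal set in (6.6) are both computed by the same maximum over generators, which is why `upper_risk` needs only one helper.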

Of course, a thoughtless application of this discretization may lead to very bad results. This is because discretizing $\Theta$ means that we do not want to discriminate between different elements $\theta_1$ and $\theta_2$ of one $H$. Therefore, it is crucial to choose a sensible partition of $\Theta$ in order to get sensible results, all the more so since choosing a partition of $\Theta$ means choosing the statistical purpose.

So far, this method can be justified well. However, problems arise in applications, since the applications presented in this book require that credal sets are given by a finite number of restrictions; cf. e.g. (5.30). Even if there is a finite set $\mathcal{K} \subset \mathcal{L}_\infty(\mathcal{X},\mathcal{A}')$ such that

$$\mathcal{M}'_\theta = \big\{ P'_\theta \in \mathrm{ba}^+_1(\mathcal{X},\mathcal{A}') \;\big|\; P'_\theta[f] \le \overline{P}'_\theta[f] \ \ \forall f \in \mathcal{K} \big\}$$

it does not seem to be clear whether assumption (5.30) is fulfilled for $\mathcal{M}'_H$, which would be necessary to work successfully with $\mathcal{M}'_H$ in our applications. An ad hoc solution of this problem is to use the credal set

$$\hat{\mathcal{M}}'_H = \big\{ P'_H \in \mathrm{ba}^+_1(\mathcal{X},\mathcal{A}') \;\big|\; P'_H[f] \le \overline{P}'_H[f] \ \ \forall f \in \mathcal{K} \big\}$$

as an "approximation" of $\mathcal{M}'_\theta$. It is easy to see that

$$\mathcal{M}'_\theta \subset \hat{\mathcal{M}}'_H$$

After that, $(\mathcal{X},\mathcal{A}')$ may be discretized according to Subsection 5.4.2, where the index set is given by $\mathcal{H}$.
