models

If a set of observational data (e. g., in the form of a joint probability function) is available to an empirically minded causal analyst who wants to evaluate the causal influence of one observed variable X on a distinct, correlated variable Y, the analyst is best advised first to rule out that the observed correlation is induced by factors other than the potential causal influence under examination. So-called spurious correlation between X and Y is generated by confounding factors Z that influence both X and Y at the same time, confounding the analysis and ultimately biasing the estimate of the influence under consideration. This goes under the name ‘confounding bias’ in the respective literature.61
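To make the notion of spurious correlation concrete, the following minimal Python sketch simulates a toy model that is not taken from the text: a hypothetical confounder Z drives both X and Y, while X has no influence on Y at all. The linear-Gaussian mechanisms, the sample size, and all function names are illustrative assumptions. In the observational regime X and Y should correlate noticeably (about 0.5 in theory for this model); once X is assigned at random, as a stand-in for a controlled experiment, the correlation should vanish up to sampling noise.

```python
import random


def simulate(n=20000, seed=0, randomize_x=False):
    """Sample from the assumed toy model Z -> X, Z -> Y (no arrow X -> Y)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)              # hypothetical confounder
        if randomize_x:
            x = rng.gauss(0.0, 1.0)          # X assigned at random, ignoring Z
        else:
            x = z + rng.gauss(0.0, 1.0)      # X depends on Z
        y = z + rng.gauss(0.0, 1.0)          # Y depends on Z only, never on X
        xs.append(x)
        ys.append(y)
    return xs, ys


def pearson(xs, ys):
    """Plain sample correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5


r_observational = pearson(*simulate())
r_randomized = pearson(*simulate(randomize_x=True))
print("observational correlation:", round(r_observational, 3))
print("correlation with randomized X:", round(r_randomized, 3))
```

The difference between the two numbers is exactly the part of the association that is due to Z and not to any influence of X on Y.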

60

Cf. definition 2.7.3 (Spurious Association) in [Pearl 2009, pp. 55 f.].

61

From Lewis to Pearl 73

As an example from econometrics, consider Okun’s law, which describes the relationship between unemployment and economic growth within a national economy and postulates, in compact form, a linear dependence (which is one of the reasons the law is so popular). Misinterpreting this dependence, one might state: one necessary requirement for decreasing unemployment is strong economic growth. Critics of Okun’s law point out that long-term variation in other parameters equally important within a national economy (such as productivity, working time, and job offers) tends to confound the direct relationship between unemployment and economic growth significantly. These additional parameters, however, do not occur in the formulation of the law. To speak of cause, effect, causal relation, or causal influence in either direction would certainly overstrain Okun’s law.

In his book Causality (2000/2009), Pearl explains why the concept of confounding goes largely unheeded in statistics textbooks:

As simple as this concept is, it has resisted formal treatment for decades, and for good reason: The very notions of “effect” and “influence” – relative to which “spurious association” must be defined – have resisted mathematical formulation. The empirical definition of effect as an association that would prevail in a controlled randomized experiment cannot easily be expressed in the standard language of probability theory, because that theory deals with static conditions and does not permit us to predict, even from a full specification of a population density function, what relationships would prevail if conditions were to change – say, from observational to controlled studies. Such predictions require extra information in the form of causal or counterfactual assumptions, which are not discernible from density functions [...]62

Density functions, i. e., probabilistic descriptions, speak precisely about closed worlds and fixed environmental conditions, whereas in the framework of structural causal models the do(·)-operator serves as an efficient tool for the virtual inspection of dependencies in alternative test scenarios.

Now, as soon as the researcher sets out to model a certain situation merely on the basis of non-experimental, purely observational data, building a causal graph G and compiling the list of functional relationships makes it possible to examine whether the causal

62

influence of one factor X on another, Y, can be uniquely estimated at all (always, of course, within the scope of the researcher’s model). Pearl spells out the central idea behind this in his definition of identifiability:

Definition 2.10.1 (Pearl’s Identifiability of Causal Effects)63

The causal effect of X on Y is said to be identifiable if the quantity P(y|do(x)) can be computed uniquely from any positive distribution of the observed variables that is compatible with G.

Especially when there exist latent variables in the model that potentially influence both X and Y at the same time (if only indirectly), a quantitative estimate within the model must be adjusted by means of other observed concomitant variables to exclude confounding bias and spurious correlation. How is a set of variables fit for this task to be found?

To make the search for a suitable variable set efficient, Judea Pearl formulates two criteria, again applicable to the graph of a causal model. The so-called back-door and front-door criteria enable the researcher to read off from the diagram the set of nodes Z (representing, of course, the corresponding variables in the probability distribution compatible with G) with which confounding influences can be adjusted away. This adjustment is achieved by suitably summing over the potential values of all variables in Z. The two criteria are presented briefly in the following:

Definition 2.10.2 (Pearl’s Back-Door Criterion)64

A set of nodes Z satisfies the back-door criterion relative to an ordered pair of nodes ⟨Xi, Xj⟩ in a DAG G if:

(i) no node in Z is a descendant of Xi; and

(ii) Z blocks every path between Xi and Xj that contains an arrow pointing towards Xi.

Analogously, Z satisfies the back-door criterion relative to two disjoint sets of variables ⟨X, Y⟩ if Z satisfies the back-door criterion relative to any pair ⟨Xi, Xj⟩ with Xi ∈ X and Xj ∈ Y.

63

Cf. definition 4 in [Pearl 1995, p. 674], slightly adjusted here to maintain consistent notation.

64

Such a set Z accordingly d-separates all paths that would leave open a back-door into Xi for some possible confounding factor, hence the name of the criterion. In the miniature example given in the left graph of figure 2.10, the direct influence of variable Xi on variable Xj shall be assessed, obviously along the path Xi → X6 → Xj. Employing the d-separation criterion, potential confounders outside the path Xi → X6 → Xj can be made out, i. e., variables that (within the causal diagram) influence both Xi and Xj simultaneously when wiggled by way of modification. The back-door criterion now identifies the minimal sets {X3, X4} or, alternatively, {X4, X5} as sufficient for screening off spurious influences. X4 alone would not do the job: although, according to the definition of the d-separation criterion, the path Xi ← X4 → Xj would become blocked by conditioning on X4, conditioning on X4 at the same time opens the flow of information along the outer path via X1 and X2, because X4 is the collider node in this v-structure.

[Figure 2.10, node labels. Left diagram: X1, X2, X3, X4, X5, X6, Xi, Xj. Right diagram: Xi, Z, Xj, U (Unobserved).]

Fig. 2.10: In the left diagram the effect of Xi on Xj can be estimated consistently by adjusting for the variable pairs {X3, X4} or {X4, X5}; the right diagram illustrates adjustment for Z by applying the front-door criterion.
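The back-door check for this miniature example can be sketched in a few lines of Python. The edge list is an assumption: the arrows of the left diagram are not spelled out in the text, so the sketch uses a structure that is merely consistent with the paths mentioned (Xi → X6 → Xj, Xi ← X4 → Xj, and an outer path via X1 and X2 on which X4 is a collider). The function names are hypothetical, and the brute-force path enumeration is only meant for small graphs; it follows the two conditions of the definition literally.

```python
# ASSUMED arrows (parent, child) for the left diagram of Fig. 2.10.
EDGES = [("X1", "X3"), ("X1", "X4"), ("X2", "X4"), ("X2", "X5"),
         ("X3", "Xi"), ("X4", "Xi"), ("X4", "Xj"), ("X5", "Xj"),
         ("Xi", "X6"), ("X6", "Xj")]


def descendants(node, edges):
    """All nodes reachable from `node` along directed arrows."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(src, dst, edges):
    """Yield every simple path from src to dst, ignoring arrow directions."""
    nbrs = {}
    for parent, child in edges:
        nbrs.setdefault(parent, set()).add(child)
        nbrs.setdefault(child, set()).add(parent)

    def walk(path):
        if path[-1] == dst:
            yield path
            return
        for nxt in sorted(nbrs.get(path[-1], ())):
            if nxt not in path:
                yield from walk(path + [nxt])

    yield from walk([src])


def is_blocked(path, z, edges):
    """d-separation test for a single path, given the conditioning set z."""
    eset = set(edges)
    for a, m, b in zip(path, path[1:], path[2:]):
        if (a, m) in eset and (b, m) in eset:          # m is a collider
            if m not in z and not (descendants(m, edges) & z):
                return True
        elif m in z:                                    # chain or fork node
            return True
    return False


def satisfies_backdoor(x, y, z, edges):
    z = set(z)
    if z & descendants(x, edges):                       # condition (i)
        return False
    for path in simple_paths(x, y, edges):              # condition (ii)
        starts_with_arrow_into_x = (path[1], x) in set(edges)
        if starts_with_arrow_into_x and not is_blocked(path, z, edges):
            return False
    return True


for candidate in [{"X3", "X4"}, {"X4", "X5"}, {"X4"}, set()]:
    print(sorted(candidate), satisfies_backdoor("Xi", "Xj", candidate, EDGES))
```

Under the assumed arrows, the two pairs named in the text pass the check, whereas X4 on its own fails because conditioning on the collider opens the outer path.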

Now, the front-door criterion takes care of those cases in which possible back-door paths run through unobserved variables, which of course cannot serve as a candidate set Z for screening off spurious correlation in computation: unobserved variables cannot be adjusted for. Pearl’s graphical solution:

Definition 2.10.3 (Pearl’s Front-Door Criterion)65

A set of nodes Z satisfies the front-door criterion relative to an ordered pair of nodes ⟨Xi, Xj⟩ in a DAG G if:

(i) Z blocks all directed paths from Xi to Xj;

(ii) there are no unblocked back-door paths from Xi to Z; and

(iii) all back-door paths from Z to Xj are blocked by Xi.

Here, conditions (i) through (iii) pick out precisely those sets of nodes that mediate the (otherwise unconfounded) influence of some Xi on some Xj.

[Figure 2.11, node labels: Warm-up Exercises (X), Intra-game Proprioception, Injury (Outcome), Team Motivation/Aggression, Neuromuscular Fatigue, Contact Sport, Tissue Weakness, Previous Injury, Pre-game Proprioception, Fitness Level, Connective Tissue Disorder, Coach, Genetics.]

Fig. 2.11: A complex causal diagram illustrating the effect of warm-up exercises X on an athlete’s susceptibility to injury Y (taken from [Shrier & Platt 2008, figure 2]).

A more complex example from medical practice, illustrated in figure 2.11, relates potential factors that contribute to or reduce an athlete’s susceptibility to injury while practicing the respective sport.66 The effect of warming up before the game (represented by X) on the danger of injury (the outcome, Y) is to be tested. The mediating variable intra-game proprioception measures the athlete’s balance and muscle control. In the upper part of the diagram the coach influences team motivation and aggression during the game, which in turn makes both an earlier injury and participation in warm-up exercises more probable. Coach and

65

Definition 3.3.3 in [Pearl 2009, p. 82].

66

Cf. for this and the following the presentation of this example case in [Shrier & Platt 2008].

genetic predisposition together contribute to the athlete’s fitness level, and so forth. The question (in the center of the graph) whether or not the respective game falls under the category of contact sport also influences the probability of a previous injury, independently of team motivation. The influence of warm-up exercises on potential injury is obviously confounded by a multitude of factors. Application of the back-door criterion facilitates the search for a set of nodes Z that helps adjust for confounding factors: the variables measuring neuromuscular fatigue and possible tissue weakness are jointly sufficient for screening off spurious influences because they intercept all back-door paths from X to the putative outcome Y. The path running through the question of contact sport or not is inactive without conditioning anyway, since it contains a collider node, an inverted fork. Previous injury is thus to be excluded from the adjusting set of variables Z, because gaining knowledge about previous injuries would open precisely such a back-door again, thereby establishing indirect dependence between warm-up exercises and susceptibility to injury. Pearl’s graphical criteria facilitate the identification of confounders and adjusting variables even in this rather complex example from medical practice.

Now, in case the effect of some variable on a second variable turns out to be identifiable in a given causal model, Pearl offers a set of rules, sound and complete, for the reduction of probabilistic expressions containing the do(·)-operator to expressions without it. The so-called do-calculus enables the researcher to estimate post-intervention quantities merely from non-experimental, observational distributions. For the case that a set of variables screening off some causal flow from spurious influences can be made out, either by employing the back-door criterion or the front-door criterion, Pearl moreover presents two formulae for adjustment in [Pearl 1995], also restated in [Pearl 2009, pp. 79 ff.]. The two respective theorems are given here for the sake of completeness and conclude this section before the concept of token causation is examined more closely below:

Theorem 2.10.4 (Pearl’s Back-Door Adjustment)67

If a set of variables Z satisfies the back-door criterion relative to ⟨X, Y⟩, then the causal effect of X on Y is identifiable and is given by the formula

\[ P(y \mid do(x)) = \sum_{z} P(y \mid x, z)\, P(z). \tag{2.18} \]
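As a small numerical illustration of the back-door adjustment, the following Python sketch evaluates the sum over z on a toy distribution. The model (a single binary confounder Z with arrows Z → X, Z → Y, and X → Y) and all probability values are made-up assumptions, chosen only so that the arithmetic is easy to follow; the helper names are hypothetical. The sketch first builds the observational joint distribution and then computes both the plain conditional P(y|x) and the adjusted quantity from that joint alone.

```python
from itertools import product

# Hypothetical parameters of the toy model Z -> X, Z -> Y, X -> Y.
P_Z1 = 0.5
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.5, (1, 1): 0.7}


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def observational_joint():
    """Joint table P(x, y, z) implied by the assumed parameters."""
    table = {}
    for x, y, z in product((0, 1), repeat=3):
        table[(x, y, z)] = (bern(P_Z1, z)
                            * bern(P_X1_GIVEN_Z[z], x)
                            * bern(P_Y1_GIVEN_XZ[(x, z)], y))
    return table


def prob(table, x=None, y=None, z=None):
    """Marginal probability of the fixed values; None means 'sum out'."""
    want = (x, y, z)
    return sum(p for key, p in table.items()
               if all(w is None or w == k for w, k in zip(want, key)))


def naive_conditional(table, x, y):
    """P(y | x), which mixes the causal effect with confounding by Z."""
    return prob(table, x=x, y=y) / prob(table, x=x)


def backdoor_adjustment(table, x, y):
    """Sum over z of P(y | x, z) P(z), i.e. the adjustment formula."""
    total = 0.0
    for z in (0, 1):
        p_y_given_xz = prob(table, x=x, y=y, z=z) / prob(table, x=x, z=z)
        total += p_y_given_xz * prob(table, z=z)
    return total


joint = observational_joint()
print("P(y=1 | x=1)     =", round(naive_conditional(joint, 1, 1), 4))
print("P(y=1 | do(x=1)) =", round(backdoor_adjustment(joint, 1, 1), 4))
```

With these assumed numbers the plain conditional comes out at 0.62, while the adjusted value is 0.5, which matches what the model yields when X is set to 1 independently of Z (0.5 · 0.3 + 0.5 · 0.7).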

67

Theorem 2.10.5 (Pearl’s Front-Door Adjustment)68

If a set of variables Z satisfies the front-door criterion relative to ⟨X, Y⟩ and if P(x, z) > 0, then the causal effect of X on Y is identifiable and is given by the formula

\[ P(y \mid do(x)) = \sum_{z} P(z \mid x) \sum_{x'} P(y \mid x', z)\, P(x'). \tag{2.19} \]
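The front-door adjustment can be checked numerically in the same spirit. The sketch below assumes a toy model with an unobserved binary variable U and arrows U → X, U → Y, X → Z, Z → Y; the structure mirrors the right diagram of figure 2.10 as described in its caption, but all probability values and helper names are made-up assumptions. The adjustment is computed from the observed joint P(x, z, y) only, with U summed out and the inner sum weighted by P(x′), and is then compared with the value obtained from the full model, which uses U.

```python
from itertools import product

# Hypothetical parameters of the toy model U -> X, U -> Y, X -> Z, Z -> Y.
P_U1 = 0.5
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}
P_Z1_GIVEN_X = {0: 0.1, 1: 0.9}
P_Y1_GIVEN_ZU = {(0, 0): 0.1, (1, 0): 0.5, (0, 1): 0.4, (1, 1): 0.8}


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def observed_joint():
    """Joint table P(x, z, y) after summing out the unobserved U."""
    table = {}
    for u, x, z, y in product((0, 1), repeat=4):
        p = (bern(P_U1, u) * bern(P_X1_GIVEN_U[u], x)
             * bern(P_Z1_GIVEN_X[x], z) * bern(P_Y1_GIVEN_ZU[(z, u)], y))
        table[(x, z, y)] = table.get((x, z, y), 0.0) + p
    return table


def prob(table, x=None, z=None, y=None):
    """Marginal probability of the fixed values; None means 'sum out'."""
    want = (x, z, y)
    return sum(p for key, p in table.items()
               if all(w is None or w == k for w, k in zip(want, key)))


def front_door_adjustment(table, x, y):
    """Sum_z P(z | x) * Sum_x' P(y | x', z) P(x'), from observed data only."""
    total = 0.0
    for z in (0, 1):
        p_z_given_x = prob(table, x=x, z=z) / prob(table, x=x)
        inner = sum(prob(table, x=xp, z=z, y=y) / prob(table, x=xp, z=z)
                    * prob(table, x=xp) for xp in (0, 1))
        total += p_z_given_x * inner
    return total


def interventional_truth(x, y):
    """P(y | do(x)) from the full model, which requires knowing U."""
    return sum(bern(P_Z1_GIVEN_X[x], z) * bern(P_U1, u)
               * bern(P_Y1_GIVEN_ZU[(z, u)], y)
               for z, u in product((0, 1), repeat=2))


observed = observed_joint()
naive = prob(observed, x=1, y=1) / prob(observed, x=1)
print("P(y=1 | x=1)            =", round(naive, 4))
print("front-door P(y=1|do(1)) =", round(front_door_adjustment(observed, 1, 1), 4))
print("model truth             =", round(interventional_truth(1, 1), 4))
```

For these assumed numbers the front-door value and the model truth agree at 0.61, whereas the plain conditional is 0.70, even though U never enters the adjustment.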