4.1 Introduction
In the foregoing chapters, an attempt was made to investigate how the socio-economic and demographic variables were associated with
fertility in Bangladesh. The format of the analysis was mainly two-way
classifications, controlling for duration of marriage and, on some occasions, current age of ever-married women. No control was made for other correlated variables, or for cases in which one variable is itself the causal effect of another. This chapter incorporates all the selected socio-economic and demographic variables into a causal model and determines the direct and indirect contributions of each of these variables to fertility levels (measured by number of
children ever born). The method of analysis is commonly known as
Path Analysis.
Path analysis was originally formulated by Wright (1921, 1934, 1960) and explicated more recently by Duncan (1966), Land (1969),
and Blalock (1971). As a statistical technique, it is no more than
conventional regression analysis with certain assumptions about
linearity, additivity, and causality (Holsinger and Kasarda, 1976:175). It provides algorithms for decomposing zero-order correlations among variables in causal models into direct and indirect components.
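The decomposition referred to here is usually stated as the basic theorem of path analysis. The form below is a sketch in assumed notation (the symbols $r_{ij}$ and $p_{ik}$ do not appear in the original text): $r_{ij}$ denotes the zero-order correlation between variables $i$ and $j$, and $p_{ik}$ the path coefficient from variable $k$ to variable $i$, with all variables standardized.

```latex
r_{ij} = \sum_{k} p_{ik}\, r_{kj}
```

The index $k$ runs over all variables from which a direct path leads to variable $i$. Repeated substitution of this relation into itself expresses each correlation as a sum of a direct effect, indirect effects through intervening variables, and components due to correlated exogenous variables.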
"The initial assumption for path analysis must be the specification of the causal (or temporal) ordering between the
variables of the model. The data themselves cannot give us any
assistance either for this or for the selection of the variables to
be included in the model. The validity of these assumptions cannot
be evaluated from the data; external criteria or substantive theory must provide the basis for this stage" (Kendall and O'Muircheartaigh,
1977:11). Knowledge of causal relationships can help determine the
ordering of the variables in the model. For example, age can be
considered a variable preceding fertility. A diagrammatic
representation of the proposed model may be used to formulate the structural equations and to evaluate the results critically.
The second assumption involved is that the relationships
between the variables are linear and additive. Such assumptions may
not hold exactly in reality. Nonlinear and interaction effects are not included in the present model, although a path model can be extended to include them. The relationships between the variables are expressed in a path
analysis diagram by straight, single-headed arrows. Each arrow points
in the direction of the assumed effect. The straight lines with
single arrows are also meant to indicate a unidirectional relationship. The curved, double-headed arrows indicate the correlations between
variables for which no causal implications can be made. They also
indicate mutual dependence of the variables.
An example of a path diagram is given in Figure 4.1, where it is assumed that Z is dependent on two independent or "exogenous"
variables: X and Y. The curved double-headed arrow between X and Y
indicates that these two variables are assumed to be correlated but
that neither is the cause of the other. The straight arrows from X
to Z and from Y to Z express that these two variables, in part,
determine Z. Under the above assumptions, a
unit change in X would have the same effect on Z whatever the values
of the other variables. The variable Y would act on Z in a similar
way.
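The three-variable model of Figure 4.1 can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the data are simulated, the generating coefficients (0.5, 0.6, 0.3) are arbitrary assumptions, and the names `p_zx` and `p_zy` are hypothetical labels for the path coefficients from X to Z and from Y to Z. It estimates the path coefficients as standardized regression coefficients and checks that the correlation between X and Z splits into a direct component and a component arising through the correlation of X with Y.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated (hypothetical) data: X and Y are correlated exogenous variables,
# and Z depends linearly and additively on both.
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)
z = 0.6 * x + 0.3 * y + rng.normal(size=n)


def standardize(v):
    """Rescale a variable to mean 0 and standard deviation 1."""
    return (v - v.mean()) / v.std()


xs, ys, zs = standardize(x), standardize(y), standardize(z)

# Path coefficients are the standardized regression coefficients of Z on X and Y.
design = np.column_stack([xs, ys])
(p_zx, p_zy), *_ = np.linalg.lstsq(design, zs, rcond=None)

# Zero-order correlations among the three variables.
r = np.corrcoef([x, y, z])
r_xy, r_xz, r_yz = r[0, 1], r[0, 2], r[1, 2]

# Decomposition of r_xz: direct path plus the part shared through Y.
direct_part = p_zx
correlated_part = p_zy * r_xy

print("r_xz                 :", round(r_xz, 4))
print("direct (p_zx)        :", round(direct_part, 4))
print("via r_xy (p_zy*r_xy) :", round(correlated_part, 4))
```

Because X and Y are both exogenous, the second component is not a causal indirect effect; it is the part of the correlation that remains unanalysed owing to the correlation between X and Y.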
FIGURE 4.1
Hypothetical Three-Variable Path Diagram
The third assumption in path analysis is that there is
complete determination of the dependent variables involved. This is
satisfied by the inclusion of variables representing residual factors. They are not standard disturbance terms, but stand for variables left out of the model, either purposely or accidentally, for measurement error, and for departures of the true relationships from linearity and
additivity. In Figure 4.1, such a residual factor is represented by R; no residuals are attached to ultimate (exogenous) variables. The assumptions regarding residual factors are that they have a mean value of zero and that they are
uncorrelated with all prior variables and hence with each other. A
further assumption is that of homoscedasticity (equal dispersion or
spread) of the residual factors. "The violation of the homoscedasticity
assumption produces inefficient, but unbiased, estimates of the
parameters. There are strategies for handling such situations; but
unless the exact form of the heteroscedasticity is known they cannot be handled by an orthodox regression program" (Macdonald, 1977:85).
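The assumption of complete determination can be written out for the model of Figure 4.1. The expression below is a sketch in assumed notation, with all variables standardized: $p_{ZX}$ and $p_{ZY}$ denote the path coefficients from X and Y to Z, $p_{ZR}$ the path from the residual factor, $r_{XY}$ the correlation between X and Y, and $R^2_{Z \cdot XY}$ the squared multiple correlation of Z on X and Y. None of these symbols appears in the original text.

```latex
1 = p_{ZX}^{2} + p_{ZY}^{2} + 2\, p_{ZX}\, p_{ZY}\, r_{XY} + p_{ZR}^{2},
\qquad
p_{ZR} = \sqrt{1 - R^{2}_{Z \cdot XY}}
```

The first three terms make up the proportion of the variance of Z accounted for by X and Y; the residual path takes up whatever remains, so that the variance of Z is fully determined.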
The last assumption is that the variables are measured at
least on an interval scale. However, there are exceptions to this
constraint. Binary variables (taking values of 0 and 1) can be
included and treated as interval level variables. As dependent
variables they can generate heteroscedasticity problems (Goldberger,
1964:249), but as predictors they are invaluable. Binary variables
can also be assigned numerical scores because the regression coefficients, being independent of origin, will remain unaffected. Ordinal variables can also be used in path models.
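The independence of the regression coefficient from the origin of a binary score can be checked directly. The following is a minimal sketch on simulated data; the variables `d` and `y` and the generating values are hypothetical and are not taken from the survey analysed in this study. A binary predictor is scored first as 0/1 and then as 1/2, and the slope is computed under each scoring.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Simulated (hypothetical) data: a binary predictor and a continuous outcome.
d = rng.integers(0, 2, size=n)            # scored 0/1
y = 2.0 + 1.5 * d + rng.normal(size=n)


def slope(x, y):
    """OLS slope of y on x in a regression that includes an intercept."""
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)


slope_01 = slope(d, y)        # predictor scored 0/1
slope_12 = slope(d + 1, y)    # same predictor scored 1/2 (origin shifted)

print("slope with 0/1 scoring:", round(slope_01, 6))
print("slope with 1/2 scoring:", round(slope_12, 6))
```

Only the intercept changes when the origin is shifted. A change in the spacing of the scores (for example 0/2 instead of 0/1) would rescale the unstandardized slope, although the standardized coefficient would again be unaffected.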
Under the conditions laid down above, if $X_2$ is assumed to be dependent on $X_1$; $X_3$ on $X_1$ and $X_2$; and $Y$ on $X_1$, $X_2$, and $X_3$; then the system of equations can be written as:
$$
\begin{aligned}
X_2 &= B_{21} X_1 + B_{2u} X_u \\
X_3 &= B_{31} X_1 + B_{32} X_2 + B_{3v} X_v \\
Y &= B_{01} X_1 + B_{02} X_2 + B_{03} X_3 + B_{0w} X_w
\end{aligned}
\tag{1}
$$
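The recursive system (1) can be estimated one equation at a time by ordinary least squares on standardized variables. The following is a minimal sketch under stated assumptions: the data are simulated with arbitrary generating coefficients, the function `path_coefficients` is a hypothetical helper, and the residual paths are taken as the square root of one minus the squared multiple correlation, which presumes standardized residual variables. It is not the estimation procedure of the original study.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Simulated (hypothetical) data following the causal ordering X1 -> X2 -> X3 -> Y.
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
x3 = 0.3 * x1 + 0.4 * x2 + rng.normal(size=n)
y = 0.2 * x1 + 0.3 * x2 + 0.5 * x3 + rng.normal(size=n)


def standardize(v):
    """Rescale a variable to mean 0 and standard deviation 1."""
    return (v - v.mean()) / v.std()


def path_coefficients(dependent, predictors):
    """Return standardized OLS coefficients and the residual path sqrt(1 - R^2)."""
    dep = standardize(dependent)
    design = np.column_stack([standardize(p) for p in predictors])
    beta, *_ = np.linalg.lstsq(design, dep, rcond=None)
    r_squared = 1.0 - np.mean((dep - design @ beta) ** 2)
    return beta, np.sqrt(1.0 - r_squared)


# One regression per equation of the system.
(b21,), b2u = path_coefficients(x2, [x1])
(b31, b32), b3v = path_coefficients(x3, [x1, x2])
(b01, b02, b03), b0w = path_coefficients(y, [x1, x2, x3])

# Effects of X1 on Y: the direct path plus every compound path through X2 and X3.
direct = b01
indirect = b21 * b02 + b31 * b03 + b21 * b32 * b03
total = direct + indirect

r_1y = np.corrcoef(x1, y)[0, 1]
print("direct effect of X1 on Y  :", round(direct, 4))
print("indirect effect of X1 on Y:", round(indirect, 4))
print("total effect              :", round(total, 4))
print("zero-order correlation    :", round(r_1y, 4))
```

Since X1 is the only exogenous variable and every possible path is included, the total effect of X1 on Y reproduces their zero-order correlation. For X2 and X3 the correlation with Y also contains a spurious component due to their common dependence on prior variables.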