The analysis of data in this study is approached in two ways. First, the bivariate relationship of the dependent variable with the predictor variables listed above under each of the proximate variable groups is examined. For this purpose two indices are used: the percentage of children who are undernourished, defined as those with standard deviation scores of -2.00 or below; and mean standard deviation scores (z-scores). These bivariate relationships are, however, not the main focus of the research, which attempts to assess the influence that socio-economic and behavioural factors have on the dependent variable. These relationships are, as the Mosley-Chen framework shows, by no means simple. Various proximate determinants can have effects in different directions: some characteristics can cause the given indicator of nutritional status to fall below the cut-off while others can push the indicator above it. Therefore a suitable multivariate approach has to be employed.
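The two bivariate indices can be illustrated with a minimal sketch. The function name, the category labels and the z-scores below are hypothetical and are not taken from the SLDHS data; only the cut-off of -2.00 comes from the text.

```python
from statistics import mean

CUTOFF = -2.0  # z-scores at or below this value count as undernourished

def bivariate_indices(z_scores):
    """Return (percentage undernourished, mean z-score) for one subgroup."""
    if not z_scores:
        raise ValueError("z_scores must not be empty")
    below = sum(1 for z in z_scores if z <= CUTOFF)
    return 100.0 * below / len(z_scores), mean(z_scores)

# Hypothetical z-scores for two categories of one predictor variable
groups = {
    "urban": [-0.5, -2.1, -1.0, 0.3],
    "estate": [-2.5, -2.0, -1.8, -3.1],
}
summary = {name: bivariate_indices(zs) for name, zs in groups.items()}

for name, (pct, avg) in summary.items():
    print(f"{name}: {pct:.1f}% undernourished, mean z-score {avg:.2f}")
```

Computing both indices for each category of a predictor gives the kind of bivariate comparison described above, before any multivariate adjustment.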
In accordance with the objectives of the research, the dependent variable (the indicator of nutritional status expressed in terms of z-scores) has been treated as dichotomous (undernourished, 1 and better-nourished, 0), and therefore a logistic regression procedure is appropriate for the purpose. The logistic regression procedure is increasingly being used for such data, particularly in the epidemiological field where the outcome variable is binary (Kleinbaum, Kupper and Muller, 1978). The regression model fitted to the data is as follows:
$$\log\left(\frac{p}{1-p}\right) = B_0 + B_1X_1 + B_2X_2 + \dots + B_nX_n$$
where $p$ is the probability of a child between the ages of 3 and 36 months being well nourished,
$B_0, B_1, \dots, B_n$ = regression coefficients, and
$X_1, \dots, X_n$ = independent (or predictor) variables.
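As a rough illustration of how the additive logit model maps predictor values to a probability, the following sketch evaluates the linear predictor and inverts the logit. The coefficient values and the two dummy-coded predictors are hypothetical and are not estimates from this study.

```python
import math

def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def predicted_probability(intercept, coefficients, x):
    """Invert the logit: p = 1 / (1 + exp(-(B0 + sum of Bi * Xi)))."""
    if len(coefficients) != len(x):
        raise ValueError("coefficients and x must have the same length")
    eta = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical values: B0 = -0.4, B1 = 0.8, B2 = -0.3
b0 = -0.4
b = [0.8, -0.3]
child = [1, 0]  # dummy-coded predictors X1 = 1, X2 = 0

p = predicted_probability(b0, b, child)
print(f"linear predictor = {logit(p):.2f}, probability = {p:.3f}")
```

Applying `logit` to the returned probability recovers the linear predictor, which is the sense in which the model is additive on the log-odds scale.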
The model above assumes that the relationship between the predictor variables and the dependent variable is additive. In certain circumstances this assumption may be correct, but some of the independent variables also affect the dependent variable interactively. When such interactions are taken into account the model can be expressed as follows:
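$$\log\left(\frac{p}{1-p}\right) = B_0 + B_1X_1 + B_2X_2 + B_3X_3 + B_4X_4 + \dots + B_nX_n + X_1\left(B_{11}X_1 + B_{12}X_2 + \dots + B_{1n}X_n\right) + \dots + B_{nn}X_nX_n$$
where the $B_{ij}$'s are the coefficients of the interactions between $X_i$ and $X_j$.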
The models were fitted using Generalized Linear Interactive Modelling (GLIM), developed by the Royal Statistical Society of London (Aitken et al., 1989). The GLIM program calculates, for each model, the overall goodness of fit, the regression parameters of the independent variables, and their standard errors. The goodness of fit statistics are known as the deviance (maximum likelihood estimates) and degrees of freedom. Model selection was done using a forward entry method: the χ² values were estimated by comparing the deviance and residual degrees of freedom of successive hierarchical models. The variables for the model were selected on the basis of their levels of statistical significance.
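The comparison of successive hierarchical models can be sketched as a difference-in-deviance test. This is an illustration only, not GLIM output: the deviances and degrees of freedom are hypothetical, and the small table of 5 per cent critical values of the chi-square distribution is an assumption made to keep the example self-contained.

```python
# Standard 5% critical values of the chi-square distribution, by degrees of freedom
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def deviance_test(dev_small, df_small, dev_large, df_large):
    """Compare two nested models.

    Returns (chi-square, degrees of freedom, significant at 5%), where the
    chi-square is the drop in deviance when moving to the larger model.
    """
    chi2 = dev_small - dev_large
    df = df_small - df_large
    if df not in CHI2_CRIT_05:
        raise ValueError("no critical value tabulated for df = %r" % df)
    return chi2, df, chi2 > CHI2_CRIT_05[df]

# Hypothetical forward-entry step: adding a three-category variable
# (two parameters) to a model with deviance 1250.4 on 1999 residual df.
chi2, df, keep = deviance_test(1250.4, 1999, 1238.1, 1997)
print(f"chi-square = {chi2:.1f} on {df} df, enter variable: {keep}")
```

In a forward entry procedure this test would be repeated for each candidate variable, entering the one that gives a significant reduction in deviance.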
While the overall deviance is a measure of the goodness of fit of a variable, the t statistic shows the statistical significance of each of the categories within a variable. It is obtained by dividing the parameter estimate by its standard error, and if the resulting value exceeds plus or minus 1.98, the category is statistically significant at the p < 0.05 level.
For a given variable the estimates provided by the logistic regression are relative to one category of the variable, usually the first. This is called the reference category. In the tables of regression parameters, the odds ratios shown are the exponentials of the parameter estimates. The odds ratios show the relative level of risk for each category of a variable compared to the reference category. The risk in the present case is the chance of a child aged 3 to 36 months being better nourished. The odds ratio for the reference group is by definition equal to 1.0. An odds ratio of 1.0 for any other category indicates a relative risk equal to that of the reference category; an odds ratio of less than 1.0 indicates a relatively lower chance of being better nourished, and an odds ratio of more than 1.0 a relatively better chance of being better nourished than the reference category.
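The reading of the regression tables can be sketched as follows. The category names, estimates and standard errors are hypothetical, and the ratio shown assumes that each category's parameter estimate is divided by its own standard error; only the 1.98 threshold is taken from the text.

```python
import math

T_CUTOFF = 1.98  # threshold quoted in the text for p < 0.05

def summarise_category(estimate, std_error):
    """Return the odds ratio, the estimate/SE ratio and a 5% significance flag."""
    ratio = estimate / std_error
    return {
        "odds_ratio": math.exp(estimate),
        "ratio": ratio,
        "significant": abs(ratio) > T_CUTOFF,
    }

# Hypothetical (estimate, standard error) pairs for a three-category variable.
# The reference category has an estimate of 0 and hence an odds ratio of 1.0.
estimates = {"primary": (0.25, 0.20), "secondary": (0.69, 0.22)}

table = {"none (reference)": {"odds_ratio": math.exp(0.0)}}
for category, (est, se) in estimates.items():
    table[category] = summarise_category(est, se)

for category, row in table.items():
    print(category, {k: round(v, 2) if isinstance(v, float) else v
                     for k, v in row.items()})
```

With these made-up numbers the second category has an odds ratio close to 2, meaning roughly twice the odds of the outcome relative to the reference category.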
As described in Chapter 2, although PPS was used for the selection of the SLDHS sample, certain socio-economic zones (zone 5, with a large concentration of estate population, and zone 7, where rainfed farming is prevalent) were oversampled. It should be remembered that the summary statistics estimated (such as means, frequencies and even life table values) for the whole sample and its subgroups have been duly weighted by the appropriate weighting factors, which are given in the SLDHS report (Department of Census and Statistics, 1988:10). In the statistical models fitted in Chapters 6 and 7, however, unweighted data were used.1
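The effect of weighting on summary statistics can be illustrated with a small sketch. The z-scores and weights below are hypothetical (the actual weighting factors are those given in the SLDHS report); children from an oversampled zone are simply assumed to carry weights below 1.

```python
def weighted_mean(values, weights):
    """Mean of values with each observation counted according to its weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total

def weighted_percent_below(values, weights, cutoff=-2.0):
    """Weighted percentage of observations at or below the cut-off."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    below = sum(w for v, w in zip(values, weights) if v <= cutoff)
    return 100.0 * below / total

# Hypothetical data: the first two children come from an oversampled zone
z_scores = [-2.5, -2.2, -0.5, -1.0]
weights = [0.5, 0.5, 1.5, 1.5]

unweighted = sum(z_scores) / len(z_scores)
weighted = weighted_mean(z_scores, weights)
pct = weighted_percent_below(z_scores, weights)
print(f"unweighted mean {unweighted:.2f}, weighted mean {weighted:.2f}, "
      f"weighted % undernourished {pct:.1f}")
```

In this made-up example the unweighted mean overstates the level of undernutrition because the oversampled children have lower z-scores, which is the kind of distortion the weighting factors are meant to correct in the summary statistics.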
While the main model analyses the relationships between the dependent variable and the set of predictor variables, other approaches are also used in the research in analysing data; they are discussed in the relevant chapters. Chapter 3 having discussed the indicators of nutrition, diagnostic indicators of nutrition and cut-off points, and the statistical methodologies adopted in the research, Chapter 4 briefly explores and describes the socio-economic and behavioural characteristics of the study population.
1 There is a great deal of controversy over the weighting of data from complex data files. According