Structural equation modelling (SEM) is a tool for testing hypotheses and inferring
relationships among observed and latent variables. As a technique it offers insight
beyond what regression allows. While multiple regression can examine only a
single dependence relationship at a time, SEM can examine a series of dependence relationships
simultaneously (Hair et al., 2010). Most research addresses interrelated questions. SEM
Structural equation modelling is thus a more detailed method for estimating statistical
relationships and dependence among variables. Compared with traditional regression models,
SEM estimates the full model in a more complex and complete way. Analyses based on
SEM can therefore more easily be developed and strengthened by re-estimating the
initial model, since the output reports the strengths and weaknesses of the
model as well as potential improvements. Structural equation modelling typically tests how
well the observed data fit a restricted structure, by imposing the structure of the hypothesized
model on the sample data (Byrne, 2001). Fitting a model to data amounts to solving a set of
equations.
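To illustrate what solving a set of equations can mean in the simplest case, the following minimal sketch solves a one-factor model with three indicators directly from a covariance matrix. It assumes, for illustration only, that the factor variance is fixed to 1 and that all covariances are positive. The function names and the example matrix are hypothetical and are not taken from this study.

```python
import math


def solve_one_factor(S):
    """Solve a one-factor, three-indicator model exactly from a 3x3 covariance matrix.

    Assumptions (illustrative): factor variance fixed to 1, all covariances positive.
    The model implies s_ij = loading_i * loading_j for i != j, and
    s_ii = loading_i ** 2 + error_i on the diagonal.
    """
    s12, s13, s23 = S[0][1], S[0][2], S[1][2]
    loadings = [
        math.sqrt(s12 * s13 / s23),
        math.sqrt(s12 * s23 / s13),
        math.sqrt(s13 * s23 / s12),
    ]
    # Whatever variance the factor does not reproduce is error variance.
    errors = [S[i][i] - loadings[i] ** 2 for i in range(3)]
    return loadings, errors


def implied_covariance(loadings, errors):
    """Rebuild the covariance matrix implied by the estimated parameters."""
    n = len(loadings)
    return [
        [loadings[i] * loadings[j] + (errors[i] if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical covariance matrix of three standardised indicators.
S = [
    [1.00, 0.48, 0.42],
    [0.48, 1.00, 0.56],
    [0.42, 0.56, 1.00],
]

loadings, errors = solve_one_factor(S)
sigma = implied_covariance(loadings, errors)
```

For this example the loadings come out at approximately 0.6, 0.8 and 0.7, and the implied matrix reproduces S exactly, because six data points are used to estimate six parameters. Models with more data points than parameters cannot in general be solved exactly, which is why an estimation method and fit tests are needed.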
3.7.1 Variables in SEM
It is useful to distinguish between the different types of variables used in structural
equation modelling, namely latent versus observed variables, and exogenous versus
endogenous variables (Byrne, 1988).
Latent variables are variables that cannot be observed or measured directly. A latent
variable is linked to observable variables, which makes its measurement
possible. These measured scores are termed observed variables, and serve as indicators of the
latent variable they are presumed to represent in the structural equation model.
Exogenous variables are synonymous with independent variables. They cause fluctuations in
the values of other variables in the model. Endogenous variables are synonymous with
dependent variables, and are thus influenced by the exogenous variables in the model, either
directly or indirectly. Changes in the values of exogenous variables are not explained by the
model, while fluctuations in the values of endogenous variables should be explained by the
model because all variables that influence them are, in principle, included in the model
A structural equation model can usually be separated into a measurement model and a
structural model. The measurement model addresses the reliability and validity of the
indicators in measuring the latent variables, while the structural model specifies the direct and
indirect relations among the latent variables and describes the amount of explained and
unexplained variance in the model (Byrne, 2001, p.3).
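As a sketch of this separation, the two parts are often written in the following matrix form. The notation is the common LISREL-style convention and is an assumption here, since the source does not give equations: x and y are the observed indicators of the exogenous and endogenous latent variables, \xi and \eta are the exogenous and endogenous latent variables, \Lambda_x and \Lambda_y hold the factor loadings, B and \Gamma hold the structural coefficients, and \delta, \varepsilon and \zeta are error terms.

```latex
% Measurement model: indicators as functions of latent variables
x = \Lambda_x \xi + \delta
y = \Lambda_y \eta + \varepsilon

% Structural model: relations among the latent variables
\eta = B \eta + \Gamma \xi + \zeta
```

The first two equations correspond to the measurement model and the third to the structural model.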
In SEM a two-step modelling approach is recommended, emphasizing the analysis of two
conceptually distinct latent variable models: measurement and structural. The testing of the
structural model, which is in effect the test of the initially specified theory, may be
meaningless unless it is first established that the measurement model is correct (Joreskog,
1993). If the chosen measurement variables are biased or wrong, the specified theory must be
modified before testing. Therefore, the measurement model should be tested before the
structural relationships are tested.
This study follows this advice. Before the structural models are tested, as reported in
Chapter IV, the measurement model for each construct is tested and reported in section 3.10
of this chapter. The convergent and discriminant validity of the measurement constructs are thus
established before moving on to the analysis of the structural model.
3.7.2 Testing by structural equation modelling
There are three generic strategies for testing structural equation models (Joreskog, 1993):
strictly confirmatory, alternative models, and model generating. When applying the strictly
confirmatory strategy, we postulate a single model based on the theory, collect appropriate
data, and then test the fit of the hypothesized model to the sample data. From the results of
this test, we either reject or fail to reject the model; no further modification to the model is
Under the alternative models strategy, several alternative models are proposed on the basis of
the theory. Following analysis of a single set of empirical data, the model that most
appropriately represents the sample data is selected.
Further, there are three degrees of model identification in structural equation
modelling: just-identified, over-identified, and under-identified (Byrne, 2001, p.35):
- A just-identified model is one in which there is a one-to-one correspondence between the data and the structural parameters, which means that the number of data variances and covariances equals the number of parameters to be estimated.
- In an under-identified model the number of parameters to be estimated exceeds the number of variances and covariances. The model therefore does not contain sufficient information to attain a determinate solution of parameter estimation.
- An over-identified model is one in which the number of estimated parameters is less than the number of data points (variances and covariances of the observed variables). The result is positive degrees of freedom, which allow for rejection of the model. The models used to test the hypotheses in this research are over-identified.
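The counting rule behind these three cases can be sketched as follows. With p observed variables there are p(p + 1)/2 distinct variances and covariances, and the degrees of freedom are this number minus the number of freely estimated parameters. The function name and the example figures are hypothetical.

```python
def identification_status(n_observed, n_free_params):
    """Classify a model by the counting rule for identification.

    n_observed:    number of observed variables (p)
    n_free_params: number of freely estimated parameters
    Returns (data_points, degrees_of_freedom, status).
    """
    # Distinct elements of a p x p covariance matrix: variances plus covariances.
    data_points = n_observed * (n_observed + 1) // 2
    df = data_points - n_free_params
    if df == 0:
        status = "just-identified"
    elif df < 0:
        status = "under-identified"
    else:
        status = "over-identified"
    return data_points, df, status


# Hypothetical example: 6 indicators and 13 free parameters.
print(identification_status(6, 13))
```

In this example there are 21 data points and 8 degrees of freedom, so the model is over-identified. Note that the counting rule is a necessary condition only; a model with non-negative degrees of freedom can still fail to be identified for other reasons.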
As discussed above, in statistical analyses we usually assume that the sample data follow a
multivariate normal distribution. This implies that the means and the covariance matrix contain
all the information. The basic model is DATA = MODEL + ERROR. The aim of the SEM
analysis is to estimate model parameters that closely represent the
corresponding population values. The most widely used estimation method is Maximum
Likelihood estimation, which assumes multivariate normal data and a reasonable sample size,
normally a minimum of 100 to 500 cases. The exact number depends on the number of
constructs in the equation system. Further, the fewer the measurement items, the
larger the number of cases required (Hair et al., 2010, p.662). Altogether, the main
1. The sample is large
2. The distribution of the observed variables is multivariate normal
3. The hypothesised model is valid
4. The scale of the observed variables is continuous (Byrne, 2001)
In this research, the Maximum Likelihood estimation method is used in the SEM analysis. The
scale of the observed variables is treated as continuous (5-point Likert-type scale). In addition, the
hypothesised model was developed from a systematic review of theories and extant research
findings. Therefore, the data used in this study meet criteria 1, 3 and 4 above. Regarding
the requirement of normally distributed observed variables, Micceri (1989) points out that
true normality is exceedingly rare in education and psychology. West, Finch and Curran
(1995) further suggest that normality should be examined both univariately and multivariately.
Still, most of the variables in the study are approximately normally distributed. In this research project
all four assumptions listed above are thus regarded as satisfied.
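A univariate screening of normality can be sketched by computing skewness and excess kurtosis for each observed variable. The function names are hypothetical. The cut-offs of 2 for absolute skewness and 7 for absolute kurtosis are commonly cited rules of thumb and are an assumption here, not values reported in this study. A multivariate check is not covered by this sketch.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("variable has zero variance")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurtosis


def flag_non_normal(values, max_skew=2.0, max_kurtosis=7.0):
    """Flag a variable whose shape departs strongly from normality.

    The default thresholds are assumed rules of thumb, not study values.
    """
    skew, kurt = skewness_and_excess_kurtosis(values)
    return abs(skew) > max_skew or abs(kurt) > max_kurtosis


# Hypothetical responses to one 5-point Likert-type item.
item = [1, 2, 3, 4, 5]
skew, kurt = skewness_and_excess_kurtosis(item)
```

For the symmetric example item the skewness is zero and the excess kurtosis is negative, so the item is not flagged.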
3.7.3 Structural equation modelling - critique
As explained, structural equation modelling comprises various powerful analysis techniques,
and has had a positive impact on research in the applied fields. However, some objections have been
raised against the use of structural equation modelling (Hair et al., 2010). One issue is
the importance of the statistical assumptions of normally distributed data and adequate sample size for
obtaining confidence in the results. Restrictions on sample size can have a significant impact on the
outcomes of structural equation modelling. Another issue relates to causal interpretation
in structural equation modelling. Even if we find correlations, this does not necessarily mean that
there is a causal relationship, or that this model is the most correct description of the relationship
(Cliff, 1983). Further, SEM has a limitation for samples with missing data. While SPSS
that missing values will have to be calculated by the mean of the existing observations. This
issue has to be handled when SEM is applied to samples with a large number of
missing values. In this study missing values are limited and do not represent an issue.
To address the issues raised above, regression analyses are conducted in combination with
SEM. Such a combination of the two analyses should contribute to a more solid confirmation
of the results.
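Replacing missing values by the mean of the existing observations, as mentioned earlier in this section, can be sketched as below. This is a minimal illustration, assuming that missing values are coded as None and that each row is one respondent; the function name and data are hypothetical, and it is not the procedure of any particular software package.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column.

    A column with no observed values is left unchanged.
    """
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    # Build new rows so the original data are not modified.
    return [
        [means[c] if row[c] is None else row[c] for c in range(n_cols)]
        for row in rows
    ]


# Hypothetical responses from three respondents on two items.
data = [
    [1, None],
    [3, 4],
    [None, 6],
]
completed = impute_column_means(data)
```

Mean substitution keeps the sample size intact but reduces the variance of the imputed variables, which is one reason why it is only unproblematic when few values are missing.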
3.7.4 Testing SEM results
In SEM there are several tests which should be conducted to ensure the reliability and validity
of the measurements. There are three groups of fit tests in SEM. First, absolute fit indices
measure how well the specified model fits the data. These fit indices include the chi-square test, the
goodness of fit index (GFI), the root mean square error of approximation (RMSEA) and the root
mean square residual (RMR). Further, other fit indices test how well the estimated
model fits relative to some alternative model (incremental fit indices), or which model among
a set of competing models is best (parsimony fit indices) (Hair et al., 2010, pp.666-669).
Incremental fit indices commonly compare the model with a null model in which all variables are
uncorrelated. Fit improvements can be obtained by specifying related multi-item concepts
(Hair et al., 2010, p.668). Parsimony fit indices compare models relative to their
complexity. Improvements are possible through better fit or simpler models (Hair et al., 2010,
p.669).
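Two of these index types can be sketched from chi-square results. The RMSEA formula below is one commonly used form (some software divides by N rather than N - 1). The comparative fit index (CFI) is used here as an example of an incremental index compared with a null model; it is an assumption of this sketch, since the source does not name a specific incremental index. Function names and input values are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (absolute fit index).

    chi2: model chi-square, df: model degrees of freedom, n: sample size.
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n greater than 1")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative fit index: fit of the model relative to a null model."""
    model_misfit = max(chi2_model - df_model, 0.0)
    baseline_misfit = max(chi2_null - df_null, model_misfit)
    if baseline_misfit == 0:
        return 1.0
    return 1.0 - model_misfit / baseline_misfit


# Hypothetical results: model chi-square 85 on 40 df, null model 1000 on 55 df, N = 301.
print(round(rmsea(85.0, 40, 301), 3))
print(round(cfi(85.0, 40, 1000.0, 55), 3))
```

Lower RMSEA values and CFI values closer to 1 indicate better fit. Both functions return the best possible value when the model chi-square does not exceed its degrees of freedom.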
Details of the tests within these groups are reported together with the results in sections 4.2.5 (factor
analyses) and 4.3.2.