PLS path modelling was considered to be a potential method to test H#3 and H#4, because it is a logical alternative if multiple regression or SEM fails to produce a meaningful solution (Hair et al., 2010). Further, PLS path modelling has become increasingly popular in the last decade, particularly for business and marketing research (Anderson & Swaminathan, 2011; Henseler, Ringle, & Sinkovics, 2009; Temme, Kreis, & Hildebrandt, 2006; Wetzels, Odekerken-Schröder, & van Oppen, 2009). The analysis is also popular in public health research, and has been used extensively to construct models of organisational cultures in health care settings. Similar to the current study, these include models based on questionnaire data collected from health care settings (Gallos, Daskalakis, Katharki, Liaskos, & Mantas, 2011; Hsu, Chang, Huang, & Chiang, 2011; Liu, 2011; Tsang, Chen, Wang, & Tai, 2012; Ziersch & Baum, 2004). The main reasons for using PLS path modelling in preference to multiple linear regression (MLR) in the current study are outlined in Table 5.3, while the main reasons for using it in preference to SEM are outlined in Table 5.4.
Table 5.3 Comparison of Multiple Linear Regression vs. PLS Path Modelling

| Multiple Linear Regression | PLS Path Modelling |
| --- | --- |
| Only one dependent/criterion variable | Unlimited number of dependent/criterion variables |
| Assumes that the dependent/criterion variable is measured at the interval/ratio level | |
| Assumes that the independent/predictor variables are not collinear (not correlated with each other) | Collinearity between the independent/predictor variables is tolerated |
| Collinearity inflates the standard errors of the regression coefficients | Collinearity does not inflate the standard errors of the regression coefficients |
| Assumes that the variance in the dependent variable is homogeneous across all the predictor variables | Homogeneity of variance is not assumed |
| Assumes that the variance is partitioned into the explained variance (due to the regression) vs. the unexplained variance (due to sampling error) | The variance is not partitioned. Assumes that all the variance is useful, and can be explained |
| Assumes that the residuals (differences between the observed and predicted values) are normally distributed either side of their mean (zero) value | No residuals are computed |
| Type II errors (failing to reject a false null hypothesis) may occur if the sample size is too small | Type II errors are not problematic because the sample size is assumed to be the whole population |
| Power analysis is necessary to determine the minimum sample size required to avoid Type II errors | Power analysis is not required to determine the minimum sample size |
Table 5.4 Comparison of SEM vs. PLS Path Modelling

| SEM | PLS Path Modelling |
| --- | --- |
| The focus is on the strength of conformity of the model with the data to explain the relationships between the variables | The focus is on predicting the relationships, and not explaining the relationships, between the variables |
| Assumes a multivariate normal distribution of variables measured at the scale/interval level. Even small departures from multivariate normality can compromise the statistical inferences | No restrictions on the measurement or distributional characteristics of the variables |
| Extracts information from the covariance matrix. Comparing variances and covariances is central to SEM | Does not extract information from the covariance matrix |
| Maximum likelihood estimation (MLE) is commonly used to fit the data to the model. MLE is based on maximizing the probability that the observed covariances are drawn from a population assumed to be the same as that reflected in the coefficient estimates | Does not use MLE to fit the data to the model |
| The SEM process centres around two steps: validating the “measurement model”, and testing the goodness of fit of the “structural model” | The process centres around two steps: validating the “measurement model”, and interpreting the “structural model” |
| Assumes that the variance in the structural model can be partitioned into the explained variance and the unexplained variance (residual error) | Assumes that all the variance is useful, and can be explained. No concern for residual error |
| The data must fit a predefined model, indicated by goodness of fit statistics. Goodness of fit tests determine if the model being tested should be accepted or rejected | No predefined model is assumed, and no goodness of fit statistics is used to determine if the model should be accepted or rejected |
| Acceptable goodness of fit measures are used to indicate convergent validity | No goodness of fit measures are used to indicate convergent validity |
| Discriminant validity is indicated by modification index coefficients | No modification index coefficients are computed to indicate discriminant validity |
| Requires a large sample size (at least 10 to 20 cases for each measurement) | Does not require a large sample size (no minimum number of cases for each measurement) |
| Often fails to converge upon a solution, especially if the sample size is too small, and/or the variables contain a small number of values (e.g. ordinal measures with less than five ranks); methodological problems arise in comparing variances and covariances | Never fails to converge upon a solution, even if the sample size is very small, and the variables contain a small number of values |
PLS path modelling, in the same way as SEM, involves two stages: first, the construction of a measurement model (i.e. the computation of the latent variables from the indicator variables using factor analysis); and, second, the interpretation of the structural model (i.e. the relationships between the latent variables). Each latent variable is assumed to consist of one factor. Unlike SEM, however, PLS path modelling is generally viewed as an exploratory rather than a confirmatory method; it is often used, as in the current study, to develop new models rather than to test hypotheses about the goodness of fit of the data to existing models. There are no goodness of fit statistics in PLS path modelling. The main assumptions of PLS path modelling are that the latent variables are reliably measured (i.e. that the indicators are strongly inter-related and define a uni-dimensional factor, or unifying concept), and that each factor exhibits convergent validity (i.e. a high proportion of the variance is explained). PLS path modelling is also more robust than SEM, meaning that it can operate simultaneously on a large number of variables, with minimal assumptions about their distributional or measurement characteristics. PLS path modelling techniques rarely fail to produce a solution, mainly because they are not restricted by violations of the theoretical assumptions of MLR and SEM. For these reasons, PLS path modelling has been described as “a silver bullet” (Hair, Ringle, & Sarstedt, 2011).
SmartPLS (Ringle, Wende, & Will, 2005) was used to construct the PLS path models in the current study. The software provides a graphical user interface (GUI) with tools to edit the layout of the path diagram, so that the PLS path analysis could be performed relatively quickly and easily (Temme et al., 2006). Figure 5.1 provides an example of the type of path diagram that can be constructed using the SmartPLS GUI. The diagram is used to explain the components of a PLS path model.
Figure 5.1 Example of a PLS path diagram constructed using the graphical user interface of SmartPLS
The variables in the path diagram were functionally defined as either indicator variables or latent variables. The indicator variables, defined by rectangular symbols in Figure 5.1, were the individual item scores measured using the questionnaire. These variables were imported directly into SmartPLS from a CSV (comma-delimited) Microsoft Excel file. The latent variables and their constituent dimensions were defined by circular symbols. They were not measured by the researcher, but consisted of the principal component scores computed by SmartPLS using factor analysis. Each latent variable was assumed to consist of one factor. The relationships between the indicators and the latent variables constituted the measurement model.
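The idea of a latent variable as a single component score derived from its indicators can be illustrated with a minimal sketch. It assumes NumPy is available and uses a plain first principal component of the standardised indicators; the function name `first_component_scores` and the item data are hypothetical, and the iterative algorithm that SmartPLS runs internally is not reproduced here.

```python
import numpy as np

def first_component_scores(indicators):
    """Return (scores, loadings) for the first principal component.

    `indicators` is a cases-by-items array of item scores.
    Illustrative only: not the SmartPLS estimation algorithm.
    """
    X = np.asarray(indicators, dtype=float)
    # Standardise each indicator (column) to mean 0 and SD 1.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Eigen-decomposition of the indicator correlation matrix.
    corr = np.corrcoef(Z, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)  # ascending eigenvalues
    weights = eigenvectors[:, -1]                     # component with largest eigenvalue
    if weights.sum() < 0:                             # resolve the arbitrary sign
        weights = -weights
    scores = Z @ weights
    # Loading = correlation between each indicator and the component scores.
    loadings = np.array(
        [np.corrcoef(Z[:, j], scores)[0, 1] for j in range(Z.shape[1])]
    )
    return scores, loadings

# Hypothetical data: three items that share one underlying factor.
factor = np.linspace(-1.0, 1.0, 20)
case = np.arange(20)
items = np.column_stack([
    factor,
    factor + 0.1 * np.sin(case),
    factor + 0.1 * np.cos(case),
])
scores, loadings = first_component_scores(items)
```

High loadings on a single component are consistent with the assumption that each latent variable consists of one factor.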
In Figure 5.1, the arrows drawn between the symbols specified the structural model in terms of the hypothetical cause and effect relationships. There were two types of relationships between the indicator and the latent variables: reflective and formative. A fan of arrows pointing out from a latent variable into a cluster of indicators represented a reflective relationship. A reflective relationship meant that the latent variable was assumed to be the common cause. The indicator variables reflected a wide range of inter-correlated effects, measured in terms of multiple item scores; they were measured with error, but exhibited internal consistency reliability. For example, Items d14 and d15 reflected the effects of co-workers’ satisfaction (COWS). An indicator variable with an arrow pointing into a latent variable represented a formative relationship, meaning that the indicator variable was a causal factor that contributed towards the variance in the latent variable and was assumed to be measured without error. Unlike a reflective indicator, it was not interpreted as an effect of the latent variable. For example, the experience and the ages of the workers, and the size of the PHCCs, were included as hypothetical formative indicators of job satisfaction (JOBSAT).
An arrow pointing out from one latent variable to another represented a causal path between a hypothetical cause and a predicted effect. For example, the arrow flowing out of the organisational culture (ORGAN) variable, into the job satisfaction (JOBSAT) variable, implied that organisational culture may be the hypothetical cause, whilst job satisfaction may be the predicted effect. The fan of arrows flowing out of the JOBSAT variable implied that job satisfaction was the common cause, whilst the following variables were reflected as the hypothetical effects of job satisfaction: control and responsibility (COAR), scheduling (SHED), interaction and opportunities (INOP), professional opportunities (PROP), extrinsic (EXTR), co-workers (COWS), praise and recognition (PRAR), and balance of family and work (BOFW).
5.1.8.1 Validation of the Measurement Model
The first stage of the modelling process validated the measurement model, by evaluating the factor loadings (including the cross loadings), the average variance extracted (AVE) (which was the same as the communality), and the Cronbach’s alpha reliability coefficients (which were tabulated in the SmartPLS output).
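The two summary statistics named above can be sketched with the Python standard library. The block below is illustrative only: the function names are hypothetical, the item scores for d14 and d15 and the loadings are invented for the example, and the formulas are the textbook definitions (Cronbach's alpha from the item and total-score variances; AVE as the mean of the squared standardised loadings), not the SmartPLS output.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per item).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]  # per-respondent totals
    sum_item_variances = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

def average_variance_extracted(loadings):
    """AVE as the mean of the squared standardised factor loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)

# Hypothetical scores of six respondents on two items of one construct.
d14 = [4, 5, 3, 2, 4, 5]
d15 = [4, 4, 3, 2, 5, 5]
alpha = cronbach_alpha([d14, d15])

# Hypothetical standardised loadings of the same two items.
ave = average_variance_extracted([0.8, 0.6])
```

Under commonly cited rules of thumb, an alpha of about .7 or higher and an AVE of about .5 or higher are taken as acceptable; these thresholds are conventions rather than values stated in this chapter.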
5.1.8.2 Evaluation of the Structural Model
The second stage evaluated the structural model, by interpreting the path coefficients and the R² values that were output directly onto the path diagram. The path coefficients defined how much of the multidimensional variance was partitioned between the latent variables and their respective dimensions. The magnitude of each path coefficient indicated the relative strength and direction (positive or negative) of the relationships between the variables. Each path coefficient measured the partial correlation between the two latent variables after the joint correlations between all the other variables had been removed, or "partialled out" (Haenlein & Kaplan, 2004). If the root cause of the correlation between the two variables was their joint correlation with another variable, then the partial correlation was reduced in magnitude. The path coefficients were standardised to take into account the different units of measurement for each variable; consequently, they ranged from -1 to +1, and were interpreted in the same way as the standardised regression coefficients in a multiple regression equation. The R² values measured the magnitude of the effects, indicating the proportions of the variance explained.
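The interpretation of path coefficients as standardised regression coefficients, with R² as the proportion of variance explained, can be sketched as follows. The block assumes NumPy and assumes that latent variable scores are already available; the function name `standardised_paths` and the scores for `organ`, `experience` and `jobsat` are hypothetical and do not come from the study data.

```python
import numpy as np

def standardised_paths(predictors, outcome):
    """Regress a standardised outcome on standardised predictors.

    Returns (beta, r_squared). Because every variable is z-scored,
    no intercept is needed and the coefficients are unit-free.
    """
    X = np.asarray(predictors, dtype=float)
    y = np.asarray(outcome, dtype=float)
    Zx = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    # Ordinary least squares on the standardised scores.
    beta, *_ = np.linalg.lstsq(Zx, zy, rcond=None)
    residuals = zy - Zx @ beta
    # R^2 = 1 - (residual sum of squares / total sum of squares).
    r_squared = 1.0 - (residuals @ residuals) / (zy @ zy)
    return beta, r_squared

# Hypothetical latent variable scores for six cases.
organ = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
experience = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
jobsat = [1.2, 1.9, 3.4, 3.6, 5.3, 5.8]

beta, r_squared = standardised_paths(np.column_stack([organ, experience]), jobsat)
```

Each element of `beta` describes the contribution of one predictor with the other predictor held constant, which is why a coefficient shrinks when two predictors share most of their correlation with the outcome.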
The statistical significance of the path coefficients and R² values was estimated by bootstrapping; this involved repeatedly drawing 1000 random samples, with replacement, from the data matrix, with 100 cases in each sample. The mean and the standard error of each path coefficient and R² value were computed. A series of one-sample t tests was then conducted to test the hypothesis that the mean value of each path coefficient, and the mean value of each R² value, was significantly different from zero at the conventional α = .05 level of significance.
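The bootstrapping procedure described above can be sketched for a single path between two latent variables. This is a minimal illustration, assuming NumPy: the function name `bootstrap_path`, the seed, and the simulated scores are hypothetical, and the path is estimated as a simple correlation (the standardised coefficient when there is only one predictor) rather than by re-running a full PLS model on every resample.

```python
import numpy as np

def bootstrap_path(x, y, n_boot=1000, n_cases=100, seed=0):
    """Bootstrap the mean, standard error and t ratio of a single path.

    Each of the `n_boot` resamples draws `n_cases` cases with replacement.
    The standard error is the standard deviation of the bootstrap estimates.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(x), size=n_cases)  # resample case indices
        estimates[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    mean = estimates.mean()
    standard_error = estimates.std(ddof=1)
    return mean, standard_error, mean / standard_error

# Hypothetical latent variable scores for 100 cases.
demo_rng = np.random.default_rng(1)
organ_scores = demo_rng.normal(size=100)
jobsat_scores = 0.8 * organ_scores + demo_rng.normal(scale=0.5, size=100)

mean, standard_error, t_ratio = bootstrap_path(organ_scores, jobsat_scores)
```

With a large number of resamples, a t ratio greater than roughly 1.96 in absolute value corresponds to significance at the α = .05 level (two-tailed).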