CHAPTER IV: ANALYSIS OF RESULTS
4.1. METHODOLOGY, GUIDE AND/OR IMPLEMENTATION PROCEDURE
The hypotheses established for this study were tested using the Structural Equation Modelling (SEM) methodology, and in particular, the Partial Least Squares (PLS) approach. SEM is a relatively new approach for assessing multivariate models with empirical data and was developed by Joreskog in the 1970s (Chin, 1998b). One reason for the increased use of SEM among researchers is the ability to simultaneously examine theory and measures (Chin & Newsted, 1999).
SEM, a second-generation multivariate analysis tool (Bagozzi & Fornell, 1982), incorporates an econometric perspective focused on prediction and a psychometric approach that models concepts as latent variables that are indirectly inferred from multiple observed measures (Barroso et al., 2010).[101] This approach allows researchers to (Fornell, 1982; Chin, 1998a; and Haenlein & Kaplan, 2004):
(1) explicitly model measurement error for observed variables;
(2) incorporate abstract and unobservable constructs (latent variables) measured by indicators;
(3) simultaneously model relationships among multiple predictor and criterion variables; and
(4) combine and test a priori knowledge and hypotheses with empirical data.
There are two stages to SEM analysis: the assessment of the measurement model and the assessment of the structural model (Hair et al., 2006; and Barroso et al., 2010).[102] The measurement model linking observed variables to their associated constructs is assessed by examining whether the theoretical constructs are correctly measured by the manifest variables (indicators), with reference to reliability and validity attributes. In contrast, the structural model linking the constructs is assessed according to the meaningfulness and significance of the hypothesised relationships between the constructs (Barroso et al., 2010). Figure 5.2 illustrates these concepts. The first latent variable can be described as an unobserved variable implied by the covariance among the measured block of indicators x11, x21 and x31. Likewise, the other two latent variables are measured by their associated observed measures,[103] and together the three latent variables and their associated indicators represent three measurement models. The structural model represented in the middle square prescribes the relations among the latent variables. In other words, each latent variable (circle) represents a construct, and each indicator (small box) represents a measure (a manifest variable measuring its associated construct), while the arrows between the latent variables (between the circles) represent the path coefficients measuring the relationships between these constructs. Details of the measurement and structural models are discussed in section 5.5.3.
[101] Latent variables are also commonly referred to as constructs, unobserved variables or factors, and measures are also commonly referred to as indicators, manifest variables or items.
[102] An illustration of the relationship between the measurement model and the structural model is presented in Figure 5.2.
Figure 5.2: Measurement and Structural Models (Reproduced from Chin, 2009)
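The two sub-models described above can also be written as a pair of equations. The following is a minimal sketch in generic SEM notation, not the notation of the reproduced figure: the symbols for the latent variables, loadings, path coefficients and error terms are assumptions introduced here, chosen so that the subscripts match the indicator labels x11, x21 and x31 used in the text (indicator i of latent variable j). Reflective indicators are assumed.

```latex
% Assumed notation (not taken from the source figure):
%   x_{ij}           indicator i of latent variable j
%   \xi_j            latent variable (construct) j
%   \lambda_{ij}     loading of indicator i on construct j
%   \varepsilon_{ij} measurement error of indicator i
%   \beta_{kj}       path coefficient from construct j to construct k
%   \zeta_k          structural residual of construct k
% Requires the amsmath package.
\begin{align}
  x_{ij} &= \lambda_{ij}\,\xi_j + \varepsilon_{ij}
    && \text{(measurement model)} \\
  \xi_k &= \sum_{j \neq k} \beta_{kj}\,\xi_j + \zeta_k
    && \text{(structural model, for each dependent construct } k\text{)}
\end{align}
```

Under this reading, each block of indicators in the figure corresponds to one set of measurement equations, and each arrow between two circles corresponds to one path coefficient in the structural equation.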
SEM enables the evaluation of the measurement and structural models in a single systematic and comprehensive analysis (Gefen et al., 2000; and Barroso et al., 2010). This combined analysis of the measurement and structural model allows measurement errors of the observed variables to be analysed as an integral part of the model and factor analysis to be combined in one operation with the hypotheses testing (Gefen et al., 2000).
[103] The measures (or indicators) for the second latent variable are x12 and x22, and those for the third latent variable include x13,
Equally important is SEM's ability to express complex variable relationships through hierarchical or non-hierarchical, and recursive or non-recursive structural equations to present a more complete picture of the entire model (Hanushek & Jackson, 1977; Bullock et al., 1994; Gefen et al., 2000; and Barroso et al., 2010). These complex causal networks enabled by SEM characterise real world processes better than simple correlation-based models. Therefore, SEM is more suited for the mathematical modelling of complex processes to serve both theory (Bollen, 1989) and practice (Dubin, 1976; Gefen et al., 2000; and Barroso et al., 2010).
The two common but distinct statistical techniques of SEM are covariance-based SEM (Joreskog, 1973; Bollen, 1989; and Rigdon, 1998) and PLS, which is a component-based or variance-based method (Wold, 1980a; 1982; 1985). These two techniques differ in the objectives of their analyses, the statistical assumptions on which they are based, and the nature of the fit statistics each produces (Barroso et al., 2010). These differences are discussed in the section that follows.
(a) Covariance-Based Structural Equation Modelling and Partial Least Squares Techniques
Advances in causal modelling which enable researchers to simultaneously study theory and measures have increased significantly. However, despite the increased use of SEM, most readers and reviewers of research articles are still more familiar with the Covariance-Based Structural Equation Modelling (CBSEM) methods than with the PLS approach (Barroso et al., 2010; Gotz et al., 2010; and Chin, 2010). The increasing interest in SEM analysis, especially among social science researchers, creates the need for comparisons between the various SEM techniques (Chin, 2010). Chin (2010) further contends that researchers using PLS path analyses are obliged to provide some initial discussion of the rationale for applying the PLS method. This section therefore includes a comparison of the attributes, underlying assumptions and limitations of the CBSEM and PLS methods, and a discussion of the rationale for employing the PLS approach (as opposed to the CBSEM method).
The CBSEM and PLS approaches to data analysis are quite distinct: they differ in terms of their objectives, statistical assumptions and the nature of the fit statistics they produce (Gefen et al., 2000; Barroso et al., 2010; and Turkyilmaz et al., 2010).
(i) Objective/Approach
The objectives of CBSEM and PLS are quite distinct. CBSEM aims to estimate the parameters of the model (for example, the loadings and path values) so as to minimise the difference between the sample covariances and those predicted by the theoretical model (Barroso et al., 2010). PLS, on the other hand, focuses on the prediction of the dependent variables (both latent and manifest) by maximising the explained variance (R²) of the dependent variables.
Therefore, while the parameter estimation process of CBSEM tries to reproduce the covariance matrix of the observed measures and relies on overall goodness of fit (Chin and Newsted, 1999) to see how well the hypothesised model fits the data (Barclay et al., 1995), the parameter estimates for PLS are obtained by minimising the residual variances of the dependent variables. PLS is therefore more suited than CBSEM for predictive applications and theory building (exploratory analysis), although PLS can also be used for theory confirmation (confirmatory analysis) (Barroso et al., 2010).
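The difference between the two objectives can be made concrete with a small numerical sketch. The Python code below is illustrative only and is not the estimation routine of any SEM package. It assumes an unweighted least squares discrepancy for the covariance-based criterion (CBSEM software more commonly uses maximum likelihood, which is swapped out here for simplicity), and it uses the ordinary definition of R² for the prediction-based criterion. The function names and all numbers are hypothetical.

```python
def uls_discrepancy(sample_cov, implied_cov):
    """Covariance-based criterion (assumed ULS form): half the sum of squared
    differences between sample and model-implied covariances.
    A covariance-based estimator searches for parameters that make this small."""
    total = 0.0
    for sample_row, implied_row in zip(sample_cov, implied_cov):
        for s, m in zip(sample_row, implied_row):
            total += (s - m) ** 2
    return 0.5 * total


def r_squared(observed, predicted):
    """Prediction-based criterion: share of the variance of a dependent
    variable that is explained, i.e. 1 minus residual over total variation."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    return 1.0 - ss_residual / ss_total


# Hypothetical 2x2 covariance matrices: sample versus model-implied.
sample = [[1.0, 0.5], [0.5, 1.0]]
implied = [[1.0, 0.4], [0.4, 1.0]]
fit_gap = uls_discrepancy(sample, implied)

# Hypothetical scores of a dependent variable and their predictions.
explained = r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])
```

The first function rewards a model for reproducing the observed covariances as a whole, whereas the second rewards it for leaving little unexplained variance in each dependent variable, which is the sense in which PLS is described as prediction oriented.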
(ii) Assumptions
Whereas a CBSEM approach rests on the assumptions of a specific multivariate distribution and independence of observations, the PLS approach does not make these hard assumptions. Instead, PLS uses very general, soft distributional assumptions, which often lead to this approach being termed ‘soft modelling’ (Wold, 1980b; and Chin, 2010). Although the mathematical and statistical procedures are rigorous and robust (Wold, 1980a), the mathematical model is ‘soft’ in the sense that it makes no measurement, distributional or sample size assumptions (Barroso et al., 2010). CBSEM is only efficient and unbiased when the assumption of multivariate normality is met (Gotz et al., 2010).
(iii) Parameter Estimates
Because CBSEM is a full information approach, model misspecification can have a significant impact on the estimates obtained throughout the model (Chin, 2010).[104] In contrast, the limited estimation procedure of PLS (whereby estimates are limited to the immediate blocks to which a particular construct is structurally connected) means that misspecification in one part of a model will have less influence on the parameter estimates in other parts of the model.
(iv) Latent Variable Scores
In contrast to CBSEM, PLS avoids problems associated with inadmissible solutions and factor indeterminacy (Fornell & Bookstein, 1982; and Chin & Newsted, 1999). This is because the constructs in CBSEM are modelled as indeterminate while in PLS the constructs are modelled as determinate.[105]
[104] For example, adding an item that does not belong to a particular construct can impact estimates obtained throughout the model.
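What it means for a construct to be modelled as determinate can be illustrated with a short sketch. The Python code below computes construct scores as an exact weighted sum of standardised indicators, with no error term. The function names, the data and the fixed weights are hypothetical; in PLS the weights are estimated iteratively from the data rather than supplied by hand, so this shows only the form of the score, not the PLS algorithm itself.

```python
from statistics import mean, pstdev


def standardise(values):
    """Rescale one indicator to mean 0 and population standard deviation 1."""
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        raise ValueError("an indicator with no variance cannot be standardised")
    return [(v - centre) / spread for v in values]


def composite_scores(indicator_columns, weights):
    """Determinate construct: each case's score is a weighted sum of its
    standardised indicator values, so the score is fully known once the
    weights are fixed."""
    z_columns = [standardise(column) for column in indicator_columns]
    n_cases = len(z_columns[0])
    return [
        sum(w * column[case] for w, column in zip(weights, z_columns))
        for case in range(n_cases)
    ]


# Hypothetical data: two indicators observed for three respondents.
x1 = [1.0, 2.0, 3.0]
x2 = [2.0, 4.0, 6.0]
scores = composite_scores([x1, x2], weights=[0.5, 0.5])  # illustrative weights
```

Because every respondent receives an explicit score, the latent variable scores can be used directly in subsequent prediction, which is not possible when the construct includes an unknown error term.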
(v) Epistemic Relationship between a Latent Variable and its Measures
In terms of epistemic relationships, CBSEM was designed to operate with reflective indicators (Fornell, 1982), and any attempt to include formative indicators in the model could lead to identification problems, implied covariances of zero among indicators, and/or the existence of equivalent models (MacCallum & Browne, 1993).[106] In contrast, PLS allows working with both formative and reflective indicators (Fornell & Bookstein, 1982).
(vi) Model Complexity
PLS has the capacity to handle very complex models, with a high number of constructs, indicators and relationships. In contrast, CBSEM runs into difficulties handling larger models with 50 or more items (Barclay et al., 1995; Chin & Newsted, 1999; and Chin, 2010).
(vii) Implication
CBSEM is considered to provide optimal estimates of the model parameters, and is ideal for model confirmation and estimation of the “true” underlying population parameters. The PLS approach, on the other hand, is arguably more suitable where prediction accuracy is the goal (Chin & Newsted, 1999).
(viii) Sample Size
The minimal sample size recommendations for CBSEM range from 200 to 800 cases. In comparison, the sample size requirement of PLS for complex models is smaller, with minimal recommendations ranging from 30 to 100 cases (Chin & Newsted, 1999). In addition, the sample size for PLS can be small relative to the complexity of the model (Chin, 2010).
In terms of the directional relationship among constructs, CBSEM allows for both recursive (unidirectional) and nonrecursive (bidirectional) relationships. In contrast, PLS currently only works with recursive relationships (Barroso et al., 2010).
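Whether a proposed structural model is recursive in the sense used above (all relationships unidirectional, with no feedback loops) can be checked mechanically. The Python sketch below treats the structural paths as a directed graph and searches for cycles. The function name and the construct labels are hypothetical and are not taken from the model used in this study.

```python
def is_recursive(paths):
    """Return True if the path model contains no feedback loop.

    `paths` is a list of (source_construct, target_construct) pairs,
    one pair per arrow in the structural model."""
    graph = {}
    for source, target in paths:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])

    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}

    def visit(node):
        # Depth-first search: meeting a node that is still in progress
        # means the current path has looped back on itself.
        state[node] = in_progress
        for successor in graph[node]:
            if state[successor] == in_progress:
                return False
            if state[successor] == unvisited and not visit(successor):
                return False
        state[node] = done
        return True

    return all(visit(node) for node in graph if state[node] == unvisited)


# Hypothetical models: a simple chain and a two-way relationship.
chain_model = [("A", "B"), ("B", "C"), ("A", "C")]
feedback_model = [("A", "B"), ("B", "A")]
```

A model for which the check fails contains at least one bidirectional or circular relationship and, following the restriction noted above, would need to be respecified before being estimated with PLS.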
[105] A determinate construct is a composite of its indicators. An indeterminate construct is a composite of its indicators plus an error term (Fornell, 1982, p.5).
[106]
Despite the differences identified above, Wold (1985) suggested that both CBSEM and PLS should be considered as complementary rather than competitive methods, both having rigorous rationale of their own. A summary of the key differences discussed above is presented in Table 5.1.
Table 5.1: Comparison between PLS and CBSEM Methodology (Reproduced from Chin and Newsted, 1999, p.314)
Criterion | PLS | CBSEM
Objective | Prediction oriented | Parameter oriented
Approach | Variance based | Covariance based
Assumptions | Predictor specification (nonparametric) | Typically multivariate normal distribution and independent observations (parametric)
Parameter estimates | Consistent as indicators and sample size increase (that is, consistency at large) | Consistent
Latent variable scores | Explicitly estimated | Indeterminate
Epistemic relationship between a latent variable and its measures | Can be modelled in either formative or reflective mode | Typically only with reflective indicators
Implications | Optimal for prediction accuracy | Optimal for parameter accuracy
Model complexity | Large complexity (e.g., 100 constructs and 1000 indicators) | Small to modest complexity (e.g., less than 100 indicators)
Sample size requirements | Power analysis based on the portion of the model with the largest number of predictors; minimal recommendations range from 30 to 100 cases | Ideally based on power analysis of the specific model; minimal recommendations range from 200 to 800 cases
(b) Reasons for Using PLS-Graph
The discussion in the previous section sets out the advantages of employing the PLS approach for this study. The main objective of this study is to predict tax compliance behaviour, and PLS is prediction oriented and offers better prediction capability. As this is an incremental study which builds on prior theory by developing new measures and structural paths, the PLS approach with its limited estimation procedure (whereby estimates are limited to the immediate blocks to which a particular construct is structurally connected) offers better protection against model misspecification: any misspecification in one part of the model would have less influence on the parameter estimates in other parts of the model. Equally important is the fact that the sample size from the survey is relatively small (under 200 cases), which is not considered suitable for the CBSEM method, for which the minimal recommendations start at 200 cases. PLS, with its minimal recommendation range of 30 to 100 cases and its soft distributional assumptions, is considered suitable for this study. Another reason for selecting PLS is the ease of model specification and the reduction in complexity regarding model identification. The PLS-Graph Version 3 software used for this study is also relatively easy to use. Finally, the PLS approach has rarely been applied in research on tax compliance behaviour, and one of the objectives of this study is to use PLS to test the tax compliance model and, in the process, demonstrate that PLS can be successfully used in tax compliance studies. The next section sets out the process adopted in evaluating the PLS model.